IGF 2019 – Day 4 – Raum II - WS 267 A Tutorial on Public Policy Essentials of Data

https://www.intgovforum.org/content/igf-2019-ws-267-a-tutorial-on-public-policy…

The following are the outputs of the real-time captioning taken during the Fourteenth Annual Meeting of the Internet Governance Forum (IGF) in Berlin, Germany, from 25 to 29 November 2019. Although it is largely accurate, in some cases it may be incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid to understanding the proceedings at the event, but should not be treated as an authoritative record.

***

>> MODERATOR: All right. Good morning, everyone. Thank you so much for joining us on the last day of IGF and thank you so much for making the time for a 9:30 a.m. workshop.

So, you're here today for a tutorial on public policy essentials of data governance. I think it's a bit ‑‑ it's ironic to have a tutorial on the fifth day of IGF, when essentially, I think 90% of the workshops here have been on data governance, so we might just call this a reflection session. So, you know, repurpose it a little bit.

Since I assume a lot of you are interested in the topic of data governance and want to engage with this more, we thought what we could do, because we've all been attending a lot of workshops and sessions and there is a certain amount of fatigue, we thought we could use this room a little bit and have some questions. I have a few questions for you, so raise your hands based on what you agree with and we're also happy to take some responses and then we could get into the session. I'm Deepti Bharthur from IT for Change, I'm a senior research associate, and work on issues of platform economy, digital citizenship, et cetera.

I'm very happy to be here and I will introduce my panel in just a minute. But before we get started, so let me ‑‑ I have a question for you guys. What is data? So, we have a few options for you guys here, so for those of you who think data is me, data is you, an extension of you, raise your hands, please. Okay. One person, two persons, three persons. Okay.

For those of you that think data is like a Lego block that you use to build other things. You can raise your hand for more than one thing. Not bad, five or six.

What is data ‑‑ I think that might be a bit of a farcical question in this session, but "no idea": if you are still struggling with that question after having attended probably the 75th workshop on data governance, please raise your hands as well. We can figure out the answer together, I can't promise I have it, but like you know. One person. I will also raise my hand, I think.

And those of you that think data is a resource, so social, political, economic, et cetera, please raise your hands.

Okay. All right. So, those were my questions and what we could do, you know, just I'm not necessarily saying that this is something that is going to be very participation heavy, so please don't walk away based on this question and answer session, but you could keep these responses in the back of your mind as we go through the session today and maybe revisit some of your assumptions at the end of it, or you know we hope it reinforces.

All right, so what do you think data governance is? GDPR, how many people think data governance is GDPR? Five or six. Okay. Not bad.

Data governance is policies that will shape the global economy? Okay. We have some hands as well. Everything? Who believes data governance is literally everything? I was at RightsCon earlier this year and I think the hashtag was digital identity is everything. And that made me pause and think and I thought I don't know if that is true so I thought I'll try the same trick. So, data governance is everything?

Okay. Not many ‑‑ good not to be deterministic, I think, so then who owns data? How many of you here believe that no one, that data cannot be owned, that there is no such thing as data ownership? One half hand. Okay. We seem to be speaking to a room, I guess.

Whoever produces it? One, two, three, four, five, six. Okay. Whoever is the source of data generation, and these may not necessarily be the same thing so that's the reason why, so we have a few hands there as well.

Data is owned by whoever establishes control over it?

>> AUDIENCE MEMBER: True but not ideal.

(laughter)

>> MODERATOR: All right. Now, another teaser, so non‑personal data and personal data and this may not necessarily be a very relevant debate for this year, but it certainly was when people started talking about data governance. Do a lot of people in the audience still believe there is a very significant difference between non‑personal data and personal data? Okay. Good.

All right. Last question for all of you. Raise your hands if you've seen any of these words in any of the workshops that you've attended over the past three or four days. Okay.

Anything that I have missed? I think I've also been attending all the data governance workshops, so if there is anything I've missed, please feel free to shout out something that is not here that you think should go up here and then we could also think about that as well.

>> AUDIENCE MEMBER: IoT.

>> MODERATOR: IoT. Yes, that's a good one. Anybody else? Please?

>> AUDIENCE MEMBER: Net neutrality.

>> MODERATOR: Net neutrality, yes, that's a good one.

>> AUDIENCE MEMBER: (Speaking off mic)

>> MODERATOR: Can you say that again? Metadata. Yes. Very good. Thank you. So we're going to add those as well, so what I'd like you to do as we move on with our session today is to keep some of these ideas and these concepts that you've probably become acquainted with, are already an expert in, or are barely starting to scratch the surface of, depending where you are in this debate, in mind as we go through the panel, and you can come back at the end and think about whether these terms mean the same thing to you as when we started off.

So, with that, we have a nice full enough room now and I'd like to open up the panel. We have some very fantastic speakers for you today. We have Jean Queralt who is taking us through the role of technical infrastructure and standards in making data governance more effective. So, he will be speaking to us about ‑‑ and I'm sure a lot of you must have come across the idiom of "code is law," and I think that is something that he will at least talk about in terms of what it is that actually makes governance effective. Some of it is policy side, but some of it is also hard‑coded code and how we make that happen.

And then we have Duncan McCann from the New Economics Foundation talking about the challenge of personalized advertising and online profiling and what policy responses need to be made to this in the context of data governance.

We have Mark, a researcher and consultant from Brazil, who consults for SMEs on data governance primal consultancy and he will talk about the ownership of data.

And then we've come back around to me and I would like to talk to you a little bit about data governance from a global south perspective and put forward an idea that my organization, IT for Change, has been working on for quite some time, which is that of community data.

So with that, I would like to ask Jean to go first. Panelists will have 12 to 15 minutes to state their points, and my friend who is in the audience will be your timekeeper, so please just look to her. Thank you very much.

>> JEAN QUERALT: Is this one working? Yeah. Hi, everyone. Thanks for coming at 9:30 in the morning on the last day. I think everyone is kind of tired at this point.

Okay. Going back to some of the questions we were having at the very beginning, I would like to first discuss what is our understanding about data, and I would like to make a point that we have to start to understand that data is actually contextual, and that's one of the main elements of data itself, that we seem to be neglecting most of the time in these conversations.

What do I mean by that? No data is disconnected from anything. And data has the value that we collectively, most of the time in a collective delusion, agree to give it. Say, for instance, money, we all agree that money is valued because we all agree that it has a value, and so does data. Data represents a specific set or specific value or specific something that we have measured, and we all agree on what it means and what the value of it is.

A bird doesn't care about data, and a tree doesn't care about data. Therefore, data is never disconnected from the source where it's generated. It's always connected to the source entity.

What is profiling? Ever thought about what profiling is? I guess that's a buzz word. Everyone knows Facebook, Twitter, what all the big companies are doing. Everyone is familiar with the term profiling? Someone who is not? Okay.

Profiling is essentially trying to discover who you are by your interactions with services, and if you take the reference of data being connected to the source, you quickly realize that what they're doing is modeling you. That has some very powerful effects when we try to consider what we want to do with data.

Essentially, profiling is the scanning of the black box that is your head. We try to figure out what are physical features of yourself, what are specifically the emotional buttons that you may have, so we know how to press them and trigger you into commercial responses or changing your political opinions, and we also try to figure out who you interact with so we can try to figure out also where your taste is, what could be your networks, who you talk to, et cetera. That is profiling, and we use that to try to essentially figure out how to manipulate people and how to extract value out of it.

We really need to start understanding as soon as possible that data is not disconnected. There is really a model of everything that surrounds us, that everything comes from a source entity that we recreate a model out of it and we exploit those models, and that has very, very deep influences when we try to discuss about how we're going to be doing data governance and how we're going to have data protection laws that are human rights centered because we're not paying attention to the sources, we're only paying attention to the relationships between the sources.

Why is it not possible at the moment to enforce the typical data protection laws that we have? Well, we are essentially missing the big point of the infrastructure. Let's say, for instance, let's go back to the corporate scenario. What we tend to have is say, for instance, some of you have iPhones, right. You go to iTunes to buy an album. Who thinks that you actually bought the album? If you go to iTunes or Google Play to buy music or a movie, who still believes that you are actually buying the product?

>> AUDIENCE MEMBER: (Speaking off mic)

>> JEAN QUERALT: You're basically buying a license. You're not really buying the product anymore. You're not the owner of the product anymore. Okay.

>> AUDIENCE MEMBER: (Speaking off mic)

>> JEAN QUERALT: Yeah. Yeah. Exactly. What I'm talking about is service providers when it comes to digital assets, so let's say, for instance, you buy that and install it on some of your devices, share with some of your friends, it's living there. Okay. You happen to have bought an album and they decide, oh, actually I'm not making enough business with this and I'm going to pull it from the catalog. The moment they decide to do that, what happens? It disappears from your devices, right. It will also disappear from your friend's devices or whoever you shared it with. Why? Because they have the infrastructure to enforce it. They close the loop. They have their advocacy/policy/business model, and they have the infrastructure to enforce it. And there is nothing you can do.

Now, let's turn the table for a minute and look at what happens when it comes to citizen data. Governments, with their best will, sometimes, try to enact data protection policies, and what are those policies telling you? GDPR, PDPA in Malaysia where I live, they basically cover commercial transactions, and they basically tell you that you can provide your data to a vendor for as long as you have a commercial relationship with them, and when you decide to stop that, you can ask them to delete your data.

Well, that's licensing. The problem is when I'm requesting them to delete data, who in this room can give me the technical reassurance that the data is actually deleted? Nobody. Because you are not closing the loop. Governments are issuing policies that are supposed to protect the data but they're not implementing the infrastructure to make sure it's by design. In turn what they're doing is putting the burden of verification on our shoulders, so I have to be the one knowing if Facebook did it or not, and I have to be the one suing them if I discover that they didn't, and I have to have the money to do it and I have to find a lawyer to do it, et cetera, et cetera, et cetera. We are basically missing the point on that.

So, why is it that we are not doing it right now? Well, historically speaking, we have had this extremely strange distinction between all the other products we have and software, essentially. So, we assume that if I'm given a car, if I buy a car, the car has already gone through checks and balances to basically comply with all regulations. Right. I can't go to a scrapyard, pick up parts of a car, build it in the garage and go on the street. Why can I do that with software? Why can I claim and then go, good luck finding me later. After I extract all the data I need, money, good luck finding me in Costa Rica.

There is also data which is essentially a verification point. Has anyone in this room ever heard of any syllabus on human rights and digital rights being applied in computer science academia? There are a lot of conversations on ethics of programming, et cetera, et cetera, but by and large, no one, right. How are we expecting programmers, who are the people building this stuff, to have any inclination toward human rights or digital rights, when they have no idea? They are not part of the conversation and we're not inviting them. And beyond that, if, for instance, I'm an architect, I have a very clear set of harms that I can do. I know that if the building collapses, I'm killing people and there are very specific methods for remedy for that, et cetera, et cetera, and we've got checks and balances to make sure that buildings are safe. Right.

Well, typically programmers have no idea what the digital harms they can cause are, because we still haven't listed them, there isn't proper research on that, so how do we want to start having conversations where the technology is the one that we need to fix when the technologists are basically not part of the conversation, they're not even aware that they have to be part of the conversation, and we're not inviting them to be part of the conversation? A very typical politically incorrect question of mine is asking the room: how many tech people are here?

Yeah, look around. Three or four hands. It's cool that we're talking about advocacy, of course, that has to guide us. The implementation comes from the tech people and not the advocates.

So, quick conclusion because I think I'm almost done here in terms of time. First, let's please accept once and for all that data is not disconnected, data is us, and that changes a lot of the conversation. Second, we need to look into national cloud computing infrastructures because if governments are building roads for us, they should also be building the infrastructure to protect our data, and we should definitely start realizing that programmers are the next generation of human rights defenders, and start bringing them into the conversation as soon as possible. Thank you.

(Applause)

>> MODERATOR: Thank you so much, Jean. Over to Duncan.

>> DUNCAN McCANN: Great. So, thank you, if you could bring up the other presentation.

>> MODERATOR: I'm sorry. One second.

>> DUNCAN McCANN: Great. While Deepti does that, that was a great kind of introduction there from Jean, so I'm going to go into quite some detail on what I kind of consider the beating heart of the data infrastructure. I'm thinking about how the incessant drive toward personalization and the ads that they service and the profiles that these companies are building are, in fact, so central to the governance of data, that unless we look at them in detail and address them, we're never really going to get a data governance framework that really works for people.

So, I'm going to try to ‑‑ they're quite big topics, so I'm going to try to do them within my 15 minutes so let's see how we go.

First on personalization, a quick bit on technical stuff because not everybody knows this. When you visit a page, click on any web page on the Internet, it doesn't come pre‑loaded with advertisements. In the time between you clicking on the web page and the web page loading, the web page builds a profile of you, sends it out to an auction network where advertisers bid for your attention, and then the winners of those bids get to show you an ad, and this all happens within 50 milliseconds of you clicking on that button.
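The bid-request flow described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BidRequest` and `Bid` classes, the two bidder functions, and the highest-price-wins rule are all hypothetical stand-ins, not the protocol any real ad exchange uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BidRequest:
    page_url: str
    profile: dict  # hypothetical attributes the page has gathered about the visitor


@dataclass
class Bid:
    advertiser: str
    price: float  # amount offered for this single ad impression
    ad: str


Bidder = Callable[[BidRequest], Optional[Bid]]


def run_auction(request: BidRequest, bidders: list[Bidder]) -> Optional[Bid]:
    """Send the request to every bidder and return the highest bid, if any."""
    bids = []
    for bidder in bidders:
        bid = bidder(request)  # every bidder sees the full profile, win or lose
        if bid is not None:
            bids.append(bid)
    return max(bids, key=lambda b: b.price) if bids else None


def shoe_shop(request: BidRequest) -> Optional[Bid]:
    # Hypothetical advertiser that only bids on visitors profiled as runners.
    if "running" in request.profile.get("interests", []):
        return Bid("shoe_shop", 2.50, "Running shoes ad")
    return None


def generic_brand(request: BidRequest) -> Optional[Bid]:
    # Hypothetical advertiser that bids a small amount on everyone.
    return Bid("generic_brand", 0.40, "Generic ad")


request = BidRequest("https://example.org/news", {"interests": ["running"]})
winner = run_auction(request, [shoe_shop, generic_brand])
print(winner.advertiser if winner else "no ad shown")
```

The point to notice in the sketch is that the profile travels to every bidder, not only to the one that wins the auction.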

And so this is happening billions and billions of times a day. And so there are some real problems with this model of ad tech. In fact, it's become one of the principal reasons why the Internet has turned into kind of some surveillance infrastructure, because people are desperate to gather every single click, every single page, every single thing you visit, not only in your online world but also collecting it for things in your offline world, to as Jean said, build the most complete profiles of you possible and that's something that we'll look at in my second section.

So, delivering ads has become the central business model of most of the Internet, at least of many of the free services that we really rely on on a day‑to‑day basis. So the bid request system, where you go to a page and it does a calculation: we did a calculation for the UK, which has 60 million people, and this is happening at a rate of 10 billion times a day. This data is being transferred in a way ‑‑ so and these 10 billion bid requests are going to thousands of advertisers, and so this is a huge personal data flow that is going from the profilers who have us into the advertising industry, and it's hugely worrying, and this is a market that is hugely concentrated, so almost 90% of the digital ad revenue is spent within just two companies, Google and Facebook, so it's an extremely concentrated market, it's a market where our personal data is being shared very, very regularly. If you extrapolate that number to the world, you're talking ‑‑ it's happening trillions of times a day, and this is not just data that you would be happy to share. Within the data packets could be issues of your mental health, political affiliation, sexuality, gender, some things that can be really damaging if they were released.
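To put the UK figures quoted here on a per-person scale, a short calculation helps. It uses only the two numbers given in the talk (60 million people, 10 billion bid requests a day); the variable names are illustrative.

```python
uk_population = 60_000_000             # figure quoted by the speaker
bid_requests_per_day = 10_000_000_000  # figure quoted by the speaker

# Average number of times one person's data is broadcast in a bid request per day.
per_person_per_day = bid_requests_per_day / uk_population

print(f"{per_person_per_day:.0f} bid requests per person per day")
```

On those figures, each person's data goes out in a bid request roughly 167 times a day on average.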

There is also a real technical problem with ad tech. It's hugely subject to fraud. You have whole bots just clicking on ads, and you have even a huge cycle ‑‑ so a huge amount of money is wasted, and an incredible statistic is that 56% of the advertisements that people paid to place on the websites that you visit don't ever even get seen by a human, and so it's a huge waste of resources as well and the marketing industry is facing a lot of backlash for inflating the figures and inflating the real benefit of it, and we're starting to see the backlash from the European Information Commissioners, so a French data broker recently was fined for using this ad tech stream, so this stream of data that was being provided, to basically scrape data on almost 50 million people across the EU, and this is from a tiny data broker that nobody ever even heard of. And we've been working very closely in the UK with a number of other organizations with our Information Commissioner, and she ruled earlier in the year, basically, that ad tech contravened GDPR. But because it's central to the Internet she wasn't able to move forward and ban it, so now they're working with the advertising industry to try to find a compliance solution, but this is a really key thing that we're going to have to address.

Indeed, I was even at a talk from an FT columnist who was also in agreement that ultimately, we're going to need to deal with this problem of ad tech and it is just not compliant.

And so what we propose at the New Economics Foundation is to have privacy respecting advertising, so we don't think that advertising should be taken out of the Internet, it's a valid business model that pre‑dates the Internet in terms of media and so on, but what we reject is kind of the personal nature of it. We think that you can, and what's exciting is that companies are now proving that this is true, and so many U.S. companies, with the advent and implementation of GDPR, themselves decided that the ad tech process contravened GDPR and placed their ads in a kind of contextual way for pages in Europe, and in fact, generate more revenue than going through the ad tech model. The Financial Times recently also changed their model, again, moving away from the ad tech bid request system into more contextual advertising. We think there is more potential. It would change the dynamics of the surveillance of the Internet and would have some really positive ramifications, so it would start to tackle the data leaks. It's important to remember that Cambridge Analytica had access to the ad stream in the past. And it would reduce the commodification of data, and so data could once again go back to kind of having the public good or at least represent us.

It would force the tech giants to diversify their model, Facebook and Google would no longer be able to rely on this, and more importantly it would distribute some power back to the websites who have had their power taken away by this overcentralized kind of ad tech model, and you can see that the New York Times or the Financial Times, places like this, are already starting to do it. And an opportunity just to fight back against ad fraud and criminality.

So, with really simple tweaks, we think this would be a really fundamental change of the Internet, fundamental change of the way we think about data, and the way that data is currently used and monetized.

So, in the remaining second half I'm going to deal with the flip side of this ad tech world and building off what Jean said about how we build profiles of ourselves. So, these profiles are being built everywhere, so there are currently thousands of profiles, thousands of Duncan McCanns online sitting in various states of completeness, some may be just one or two data points, a time when my phone or WiFi masked and something got gathered by a network, and other profiles will be thousands of data points deep and we'll look at some of those in a minute.

These profiles are not just important for ad tech, although that's definitely one of the core drivers, but also as algorithms take on an increasingly large role in our lives, and this is everything from deciding whether you should get a job interview, whether you should go to prison if you are before a judge, and even where I live in London now, so in the Borough of Hackney, if you have a child now there, your child and your surrounding ecosystem, so it would be parents, grandparents, things like that, are run through an algorithm to try and predict the likelihood that that child will be subject to child abuse in the future. If that algorithm comes out with a positive, social services are all over you and checking you and making sure that nothing bad happens.

So these profiles are moving from something that helped us decide the order in which our search results were seen, the way our friend requests were posted, or what recommendations we see on Amazon or things which have limited kind of social impacts, although it's important to note that they still have economic impact, potentially, how we see these results, but they're moving into things which are really, really important for us as people.

So, it's really important that if these algorithms are making vital decisions about us, that they are actually making decisions about us, and this is where we go a little bit into what Jean was talking about, about these models that people are building about us. If we think about when an algorithm makes a judgment about us, and that happens ‑‑ it's going to happen more and more regularly, there are only three ways in which we could be wronged by that judgment. First of all, the algorithm could be designed badly. It could have in it that 1 + 1 = 3 and it would obviously come out with an incorrect decision.

The algorithm could use protected characteristics. It could use things that it's not allowed to use to make its decision, and we've seen some examples of that in the U.S. with housing advertisements not being placed ‑‑ not being placed in front of people of color, things like that, so it could use race or sex as an opportunity to discriminate or a proxy, which obviously algorithms are very good at discovering proxies, so often for race it can use address and post code and things like that.

The third way that we could be damaged is if the algorithm is in fact using incorrect data about us. It's not actually assessing us, but it's assessing an incorrect model of us, and so how do these people build up these models? So, these are just some interesting examples, and so two of the biggest profilers of us in the world are Oracle and Acxiom. So Oracle has about 3 billion profiles of us, and so that is almost half the people in the world that Oracle has a sellable profile on, and you can see that it is not just the stuff we would automatically expect them to have, so really our online data, but they're also absolutely matching this with offline data, and they claim to have upwards of 30,000 attributes on each of those 2 billion people. It's quite unimaginable, and Acxiom are the same, and they're another company big on data profiles, and again, they assemble all of these different bits of information into profiles of you. I would encourage everybody to go to these websites and do a subject access request, so you can go to them and you can find out what they know about you. It's a fascinating, if scary, exercise. I've had a number of them sent to me, they tend to be about 60 to 70 pages long printed out and it's incredible what they know about you. But almost more incredible than what they know about you is what they get wrong. And places like Acxiom, so this is one of the industry leaders, they state publicly that about 30% of the information they have about you is wrong, and yet these are the profiles that are making very, very, very important decisions about us, and as we move forward into the future, are going to be making even more important decisions about us.

And, obviously, what they do is not just take data points, but they also make inferences about us. So, they use whatever they know about us to try to find out more about us, so try to infer interests, propensities, especially around consumption because that's where profiles can be most easily monetized. So, what should we do about this?

It's not always easy to think, because like from an organization like mine, a progressive organization, when we see market failure like this, and this is something that I would really term market failure. There shouldn't be thousands of me out there, and most of it with incorrect data making vitally important decisions about me. Often our response would be that as it is in the UK now where our railways are failing, our postal service is failing, those make sense to bring back into state control.

It's not such an easy decision when thinking about something as important as our data profiles, and indeed when I go out and talk to people in the UK but also further afield, there is a big reluctance to have this kind of resource at the fingertips of the state, and that's something that we've really got to be mindful of.

This is my last slide, and so what we've recommended, and are trying to work toward establishing a kind of more specific technical understanding of as well, is that the government should fund an independent and decentralized digital identity system, and so this should, one, allow us to prove our identity online without giving away personal data about us. We had a conversation in the UK about online porn and the only solution we had was that you have to provide credit card details, and this is a sub‑optimal solution. Much better would be a new digital identity system that could provide a cryptographic key that proved I was over 18 without having to tell them I was Duncan or lived in the UK or what my email address is.

The second thing is that then it would provide a cooperative digital identity system that we build and have agency over, rather than the systems built by other companies which are built kind of in opposition to us without us knowing, so we would build up a digital identity online through easy‑to‑use apps and websites. We would have direct control over that data, decide what verified attributes we wanted to include, what inferences are okay for our profiles, and then an independent organization ‑‑ this independent organization would then stipulate how companies but also government agencies and municipalities can tap into that identity system when they need to understand our identity.
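The idea of proving "over 18" without revealing who you are can be illustrated with a signed attribute claim. The sketch below is a deliberate simplification and not the system proposed in the talk: it uses a shared-secret HMAC from the Python standard library, whereas a real deployment would more plausibly use public-key signatures or zero-knowledge proofs so that the verifier never holds the issuer's key. The function names and token layout are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical secret held by the identity provider (assumption for this sketch).
ISSUER_KEY = secrets.token_bytes(32)


def issue_age_token(over_18: bool) -> dict:
    """Issue a token carrying only the age attribute and a random nonce."""
    claim = {"over_18": over_18, "nonce": secrets.token_hex(8)}
    payload = json.dumps(claim, sort_keys=True).encode()
    tag = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "tag": tag}


def verify_age_token(token: dict) -> bool:
    """Accept the token only if the tag is valid and the claim says over 18."""
    payload = json.dumps(token["claim"], sort_keys=True).encode()
    expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    valid_tag = hmac.compare_digest(expected, token["tag"])
    return valid_tag and token["claim"].get("over_18") is True


token = issue_age_token(True)
print(sorted(token["claim"]))   # no name, address, or email in the claim
print(verify_age_token(token))
```

The token carries no name, country, or email address; the website learns only that a trusted issuer vouches for the single attribute it asked about.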

And so, we think that this would be a much better way forward for digital identity in the future, which will be such an important component of our online world, and really, we should be in charge of it and not the Oracles and the Acxioms of this world. Thank you very much.

(Applause)

>> MODERATOR: Thank you. And I hope all of you are as suitably scared as I am. Over to Mark now. You have the floor.

>> MARK: Hello, everyone. Mark speaking. This is a tough act to follow. Very interesting presentations from my colleagues. Thank you. I guess I'm lucky because I want to speak somewhere in between both of these positions because I'm originally a coder, but now a policymaker so at some point those have to intersect, even though it's harder than you might think.

I worked a lot with metadata, but not metadata in a sense that we think about profiling people. I work a lot with metadata for websites, and what that does on websites is it structures the websites. It gives them points for us to understand them, and it's all very compartmentalized. So, you say that this is the page title, this is the date it was published, things like that.
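As an illustration of the kind of structured page metadata described here, the sketch below builds a small JSON-LD document using the schema.org vocabulary. The `WebPage` type and the `name` and `datePublished` properties are part of schema.org; the page title, date, and helper function name are made-up placeholders.

```python
import json


def page_metadata(title: str, date_published: str) -> str:
    """Return a JSON-LD string describing a web page with schema.org terms."""
    document = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": title,                    # the page title
        "datePublished": date_published,  # ISO 8601 date the page was published
    }
    return json.dumps(document, indent=2)


# Placeholder values, purely for illustration.
json_ld = page_metadata("An example page", "2019-11-29")
print(json_ld)
```

Because every field has an agreed name and meaning, any search engine or platform that understands the vocabulary can read the page the same way.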

When I think about this, it's a choice that is being made for data to be more structured in the world, for it to be easier to be portable, for it to make more sense across different platforms and across different search engines, for example.

Here we're talking about websites, where do I want to go with this? Why isn't our personal data treated in the same manner? I could say because it's impossible. From a coding perspective right now, it is impossible. Each provider, collector, each aggregator of data right now, really, they aggregate your data in whatever way they want. That's true. They really do. But is it impossible to do it differently?

Could the data be collected in a very specific manner that makes it portable? Therefore, if you get tired of a platform, you can bring it to another platform, for example, instead of having to throw away all of your friends on a certain social network, all of your photos, all of your postings, and letting go of a platform right now means letting go of your history in that platform because it's platform exclusive and that makes a very tough choice to go away from a certain big platform of social media in case, you know, you're a little tired of them or their privacy policies change. That could happen, yes.
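A minimal sketch of what such portability could look like, assuming two platforms agree on a shared export format. The format name `portable-profile/0.1`, the field names, and both functions are hypothetical; no existing platform is claimed to work this way.

```python
import json

EXPORT_FORMAT = "portable-profile/0.1"  # hypothetical shared format identifier


def export_account(account: dict) -> str:
    """Serialize an account into the shared, platform-neutral format."""
    portable = {
        "format": EXPORT_FORMAT,
        "display_name": account["display_name"],
        "contacts": sorted(account["contacts"]),
        "posts": [
            {"published": post["published"], "text": post["text"]}
            for post in account["posts"]
        ],
    }
    return json.dumps(portable, indent=2)


def import_account(document: str) -> dict:
    """Rebuild an account on another platform from the shared format."""
    data = json.loads(document)
    if data.get("format") != EXPORT_FORMAT:
        raise ValueError("unknown export format")
    return {
        "display_name": data["display_name"],
        "contacts": data["contacts"],
        "posts": data["posts"],
    }


old_account = {
    "display_name": "Example User",
    "contacts": ["bea", "ana"],
    "posts": [{"published": "2019-11-29", "text": "Hello from the IGF"}],
}
new_account = import_account(export_account(old_account))
print(new_account["contacts"])
```

In the sketch, leaving a platform no longer means leaving the contacts and posting history behind, because both sides read and write the same structure.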

So, what I want to bring to our attention is that this is very possible. Our data could be our own. We could fine tune what we want platforms to see, we could fine tune what we want to bring forward, what we want to be deleted. This is just a choice for it to not be that way. From a coding perspective, there is absolutely nothing stopping things from working this way.

There is a website called schema.org which is where the structures for web pages and for all different things exist. If you drop by, you see there is even, you know, categories for insects and all sorts of very interesting things. Why isn't our data being treated in that way? Right?
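To illustrate the point about schema.org, here is a minimal sketch in Python of what a structured, portable export of a person's social data might look like. It uses the schema.org types `Person` and `SocialMediaPosting` in JSON-LD. The function names `export_profile` and `import_profile`, the choice of fields, and the sample values are assumptions made for illustration; nothing here describes how any real platform stores or exports data.

```python
import json

SCHEMA_CONTEXT = "https://schema.org"


def export_profile(name, friends, posts):
    """Serialize a profile, its contacts, and its posts as schema.org JSON-LD.

    `posts` is a list of (date_published, text) tuples. Hypothetical helper.
    """
    person = {
        "@type": "Person",
        "name": name,
        "knows": [{"@type": "Person", "name": f} for f in friends],
    }
    postings = [
        {
            "@type": "SocialMediaPosting",
            "author": {"@type": "Person", "name": name},
            "datePublished": date,
            "articleBody": text,
        }
        for date, text in posts
    ]
    return json.dumps({"@context": SCHEMA_CONTEXT, "@graph": [person] + postings})


def import_profile(document):
    """Read the same structure back, as a second platform could."""
    nodes = json.loads(document)["@graph"]
    person = next(n for n in nodes if n["@type"] == "Person")
    friends = [f["name"] for f in person.get("knows", [])]
    posts = [
        (n["datePublished"], n["articleBody"])
        for n in nodes
        if n["@type"] == "SocialMediaPosting"
    ]
    return person["name"], friends, posts
```

Because both sides agree on the vocabulary, the receiving side needs no knowledge of the exporting platform's internal database, which is the sense in which a shared structure makes data portable.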

The reason is very simple. There is no incentive for these platforms to do it this way. There is absolutely no one pushing for this and they are not going to do it themselves. Think about it from a private sector perspective, why would you make it easy for your customer to go away? What is the business model there? Here, I'll waste years changing my data collection practices, changing my database structure to make it easy for my customer to go somewhere else and take their business. Makes absolutely zero sense from my perspective. Right.

So that must mean the push has to come from somewhere. They have to be compelled to do this or they won't. It doesn't make sense for them, which is where the policymaking comes in. Right. And this is why I want to establish a dialogue with what is being said here, with both Jean and Duncan, that yes, the coding has to be a very, very serious consideration and the policy has to be a very serious consideration because we can only make this kind of idea flourish or make this sort of policy map if we are coming from a point where we are looking to both the code and the policy.

I have a lot of activist friends and I respect their work very much, it's very important. But very often, talking to them, I perceive that they don't really get what's going on behind that, so what happens is that it becomes very easy, maybe, for the platforms to give them an answer that looks right, feels right, but from a coding perspective is just about the same. You can do ‑‑ you can lie a lot with code. You can lie all the way back and forth with code. Right. You say oh, I'm not doing this anymore. Yes, what you're doing is maybe something even more insidious, and so to prove my point that it's possible, right, I'm not just saying something theoretical. There is this little point of data that I really like, and it's from Privacy International: 40% of the free apps on the Google Play Store transfer data to Facebook via the Facebook Software Development Kit.

Now, I want to give you one moment to consider this. Okay. Data portability is not even a consideration, right, nobody is talking about this. Very few people. And yet, there are platforms that theoretically are competing for your ads, for your data, and they exchange over 40% of the data that goes on in your mobile phone over a free app that's charging us in a payment for a free mobile game, and so it's very possible that if it is in their interest, if it works for them, it's very possible. If it makes business sense for them it is very possible, and so there is nothing that stops this from being a reality, right.

And at that point are they even rivals or are they just working together to collect data? Right. So how do I think we should start approaching this? And I think Jean's product is incredibly relevant in this sense.

Not enough attention is being paid to what goes on and what is already available, and this deep dive into ads really tells us the things that we need to know. This is the kind of approach that we need to start taking towards data, right. Let's look very deeply into the code, and if we can maybe have a look even deeper at the source of it, because somebody happens to have found a way to get around it, that's even better. It's the mechanism of the data flow. It's something that activists and ‑‑ people who want to engage with data from a civil society perspective need to do moving forward. I think we're a little past the point where we can talk about this in a very broad manner. This needs to happen, and this does mean a lot of very, you know, boring stuff to look into, but it's kind of where we are because they keep finding new ways to innovate with our data.

So, to me, as we think about this moving forward, policymakers need to start being engaged from both angles so that they really feel pressured. Apparently, they're not reacting enough just from a policy perspective. They find it very important, but the solution is GDPR, which as we know, creates a whole different set of problems and then creates reactionary laws all across the world, and just now people are thinking about data localization and data localization is maybe even worse than the data being spread, if the data is spread then it can be super localized and stored in a very specific place, and so things have unintended consequences.

And when we start to think about this moving forward, that's my consideration. Data portability and actually owning our data in a structured way so that we can move away if we are tired of a platform, if their privacy is not good for us. I will leave my intervention here and hope we can continue having a conversation. Thank you very much, everyone.

(Applause)

>> MODERATOR: All right. So that leaves me. Can I have my slides back, please. Thank you. So, we had, I think, some really great interventions which I think have been very effectively able to tie sometimes the invisible threads between technology and policy. What I'd like to do with my time today is to go back to the basics a little bit and give you a little bit of a different perspective on what data governance challenges and questions can look like from a global south perspective.

So, I'm just going to ‑‑ I'm sorry. So, I'm starting with a primary assumption here and I'm going to sort of develop my speaking points as we go along, that the concentration of data has been ‑‑ the concentration of data has gone together with the concentration of wealth and power.

Sorry. One second. Yes.

So, when we talk about this, what do we mean? So, there is an indisputable concentration of power that's happened today that we see everywhere, both in the hands of private players and dominant nation‑states. This is a trend we see across sectors, whether it's energy, transportation, FinTech, manufacturing, health, agriculture, what have you, ports and logistics, and IT for Change has done a lot of research on how economic sectors are viewed through data. If you track the mergers and acquisitions over the past year, if you look at any point of VC funding, all of it is essentially on the economic power of data, so that's something that we want to keep in mind.

And when this happens, what essentially is happening is that weaker economic actors and communities are facing an increasingly insurmountable disadvantage in competing against data powers, and so this is something that we see in the case of workers versus platforms. We didn't have a lot of discussion about labor rights at the IGF this year, but this is something that is really picking up across the world where we have gig workers, whether they're on food delivery platforms, whether they're on Uber and other kinds of ride sharing or ride hailing aggregator platforms, really coming up and saying that there is an atomization of labor that's happening here. There is a dismantling of what is understood as a traditional employment contract, and we don't treat people as employees, but still they seem to be working longer and harder than everybody else, so that's like a theoretical problem here.

We've noticed in developing nations, especially, small producers and traders versus e‑commerce companies, so IT for Change is working a lot with small trader groups, et cetera, who find themselves at a very difficult stage right now where the advent of Amazon and the domestic unicorn in India, which is Flipkart, has really changed the rules of the game. You have excessive deep discounting practices, hypertargeting of customers, and all of this is resting on the power of data.

We see farmers, you know, who are being coopted into an agro‑business chain that is completely being reengineered on the basis of data, so if you look at all the mergers and acquisitions that happened over the past two years, Monsanto, John Deere, which acquired See & Spray that ended up destroying the very crop it was supposed to protect, and you can look it up in a very interesting report that came out by ETC. And so you have a lot of AI‑led data consolidation happening in critical sectors, education, health, agriculture, et cetera, and what we find is that, especially for actors in the global south, not only are they unaware sometimes of the corruption that's happening, but they are increasingly powerless in the face of it as well.

And when we turn a little more to like startups and small firms, we find that most startups nowadays die or, you know, become quickly bought out for data value proposition more than ever before. They don't really stand a chance against transnational tech giants, and at a geopolitical level, we find the developing nations are also increasingly finding themselves risking policy space, risking economic pathways when it comes to the data race.

So, I just wanted to provoke you a little bit. This is not intended to be against open data and so just keeping that in mind. What I'm trying to suggest when we move toward the idea of ownership of data is that what we understand to be default open has become default privatized and I'll get into why that is.

So, what then we find is this relentless race where every actor in the global supply chain without data advantage can be outcompeted, outbid, outperformed, squeezed, demoted, rejected, or worse, ejected from the global economy.

And this scenario has happened in part due to a regulatory wild west of the digital, which has allowed for the accumulation and subsequent enclosure of data by the private sector. This is data that was and is produced from and by people in social and financial transactions. It's collected through states and statistical survey methods, and more recently, it's being generated via nodes and IoT networks.

It doesn't per se belong to private tech companies and that's the assertion that I make here, but it's captured in ways that grant a no‑questions‑asked carte blanche to them, and it doesn't have to be. And in more recent times, and a lot of panelists spoke about this, the issue of personal data has become important in the wake of Cambridge Analytica and other such problems of misinformation, hypertargeting, et cetera. Some kind of rulemaking has begun to emerge in this very critical domain, but a lot of ambiguity remains, especially in the global south as states still struggle to rise up to the institutional challenges of developing and implementing data governance frameworks that can serve their goals and be in public interest.

There is a lack of clear norms, and there are real institutional capacity deficits. Not everything is malicious or misguided, and it's not always a question of negligence; sometimes it is a real institutional capacity deficit. There is also, I think, a hyperoptimism, which is unfortunately the way with governance in the global south with regard to techno‑solutionism, and it has played a part in what I call a private by default regime, where data is extracted from communities and citizens with little thought to citizen safeguards and benefit sharing.

I'd like to give an example of why and how this happened. This is even like pre‑‑ even before people started to talk about these very important issues. In 2017 in South Africa, the welfare system was severely compromised as a result of the South African Social Security Agency, SASSA, signing a poorly framed contract with a private company to manage citizen data. The contractor was given the task of identifying beneficiaries and doing cash transfers on different matters, welfare, pension, disability assistance, et cetera, et cetera, but this company was not only able to exploit this database to make unauthorized debit deductions, because it had a sister telecom company, but what it was able to do very successfully is divert funds from the grants, from the welfare entitlements, to cell phone payments, to different kinds of FinTech loans, et cetera.

But when this issue got brought up and when SASSA wanted to like exit the contract after the time had expired, not only did the company say that if you want us to continue working with you, we're going to raise the fees. I mean, they were doing a terrible job of what they were doing, but they were saying if you want us to continue doing this terrible job, not only are we going to raise the fees, but if you exit from the contract and if you don't give us ‑‑ and if you don't renew the contract, we will walk away with the database, because their assertion was that they had created the database and therefore they had all rights to it, and that is not necessarily true at all.

But because at that time there didn't exist a language of data protection for citizens, there didn't exist due diligence and these kinds of norms within public procurement, et cetera, you had this situation where a very critical service, which is welfare, which makes a very big difference in a nation like South Africa, you know, was literally on the cusp of like this extreme crisis, and the courts intervened and said that beneficiaries' data must be protected, and you know there was some work that was done. This case is still being fought on many different ‑‑ it has many different dimensions, so check it out.

It's a classic case, like this is exactly the kind of cautionary tale that shows us how quickly data without governance can be absorbed and appropriated, and we've seen this kind of data extraction a lot in developing nations, and it's been called a digital resource grab, maybe like the original scramble among the colonial powers in the late 19th century, just like that.

And so why do I then have this slide that talks about cross‑border data flows, you might be wondering, I'm getting to that. In the absence of rules, in the past two years, there has been a concerted push for cross‑border data flows and e‑commerce by global north actors within critical trade agreements. There was a session two days ago on the Osaka Track and a lot of you might have or would have followed the news of, you know, sort of that was passed recently.

A lot of these trade agreements, pushing for cross‑border data flows, risk further shrinking an already small policy space for developing nations who are not only late to the party in terms of having a viable economic pathway, but they stand to lose even the traditional competitive advantage. For example, we know that 3D printing now will infiltrate manufacturing so much that common manufacturers in Viet Nam and Bangladesh seem to be very, very afraid of the fact that, you know, their entire competitive advantage can just be taken away from them.

And so they stand to lose the competitive advantages, and it's not necessarily the point that cross‑border data flows are in question, that's not the assertion I'm making here, but the idea is that when we insist on these in international trade agreements, what we do is foreclose possibilities for nations to arrive at these decisions in ways that serve their best interests.

So, not only is the present at stake but also the future, because countries that sign away rights in this manner will also be signing away the ability to chart effective AI strategies, national strategies for data, et cetera, et cetera.

So, you know, I already said this before so I won't get into it a lot, but a lot of my panelists did focus on the issue of personal data. It's also very important to notice that there is a lot of policy focus on personal data protection, which is an important, vital issue, but aggregate non‑personal data is just as important for the economic corporations. A lot of the data that is produced, for example, let's say survey data about like land mass in a particular country or climate patterns, soil data, et cetera, et cetera, these are not necessarily personal data but they are often produced by communities, they are often like produced through processes of communities, et cetera, and it's very important to look at these as well when we think about what is an effective data governance framework for developing nations.

Lastly, before I present my grand solution, I wanted to just also make the point about voluntary data sharing, and I'm sure a lot of you might have gone to workshops that would have pointed this out, but it almost never works. There is no such thing as voluntary data sharing. In terms of like actually articulating a public interest data framework, it cannot rest on the benevolence of capitalism and that's something that we really need to recognize when we think about data governance frameworks as well. We need to think about how best to ensure that data governance can be in public interest.

So, what should be the starting point of data governance then? In terms of hitting at all of these issues that I mentioned, the problem of, you know, an absence of rules and norms that we are currently struggling with, the threat of foreclosure of data policy space, this kind of like improper contracting, et cetera, that could happen, and the problem of like governing the non‑personal just as importantly as one governs the personal data. That is something that a lot of developing nations have been working on, so I think that's something that time can take care of, but this is something that is of utmost urgency. I want to say community data. What is community data?

Community data is essentially aggregate data ‑‑ I'm sorry. One second. Yeah. Community data is aggregate de‑identified personal data that is generated from a geographical or interest‑based community, or natural phenomena or artifacts generally associated with a community. Like I said, small‑scale farmers in India have particular farming practices that they've developed over millennia, you know, particular ways of like seed sharing, et cetera, that's data that's generated by a community. It should belong to them. It should not become enclosed in this like private by default regime by someone like Monsanto, and the only way to ensure that, to have community data as a starting point, which in my view and in the view of IT for Change opens up the possibility to hit at all of those other problems, is if we govern community ‑‑ community data as a collective resource and try to create the notion of a data commons that allows communities where data is produced, who have responsibility of data, to not only have equal rights on it but to be able to innovate based on that according to their needs and also share benefits equally. And I'll end here. Thank you so much.

(Applause)

All right. So, we could now open up the floor to questions and comments. We can collect about three. I would request members of the audience to kindly indicate if you have a question to a specific speaker to sort of mention that as well, and if it's a comment, can you also mention that.

Yes? Lady in the back and then. After you.

>> AUDIENCE MEMBER: Hello. Leigh with the WTO. I have two questions, and both of them relate to concerns about competition. My first question is for Duncan. From a competition point of view, I was interested in the brief statement you made that switching from a bid‑oriented ad model to a contextual ad model could lessen some of the effects of dominance of the bid model. I'm wondering how much do you see that effect of sort of loosening the grip of dominance, and do you have any other good examples?

The second question was for the gentleman in the middle who mentioned portability of your data, like for example with Facebook. Again, I think that's very relevant to trying to reduce the dominance and if you have ‑‑ you don't really have to be an activist to be concerned that customers can move. I mean, in the telecom sector, it became a no brainer that there is competition imbalance when customers can't move from one mobile operator to the other, so just about every country in the world now either has or is getting number portability.

Is there a way to make an analogy, a persuasive case to competition authorities that this data portability is pretty much synonymous with number portability because it keeps people from being able to choose the provider they want to choose? Thanks.

>> MODERATOR: Yes. The gentleman there. Yes. Please go ahead. There are also mics at the end of rows so you could just come up.

>> AUDIENCE MEMBER: Thank you for the presentations. I have two questions. My name is Raymond. I'm a Research Consultant with a regional think tank for the Global South on IT policy, Research ICT Africa, that is based in South Africa.

The first one is to Mr. Duncan. You talked about the importance of setting up an independent digital identity system. But you mentioned government as the proposed agency, so how then is it possible to ensure the independence of that kind of institutional structure within the architecture in that sense?

And then to the lady, I'm sorry, I didn't get your name. You proposed deidentified community data as your grand solution. I want to know how it is possible to manage the risk of re‑identification based on that. Thank you.

>> MODERATOR: Okay. We can take one more and then we can turn to the panel. Yes, sir, please. In the back.

>> AUDIENCE MEMBER: So, my name is John. I'm also from the WTO. I have one question for Jean. You mentioned including human rights in the engineering academic curriculum. Is it something related to privacy by design, that we have in GDPR? If so, can you clarify or make the link?

The question for all of you, it's about anonymization and encryption technologies. No one has mentioned that. Can you give your thoughts about this?

And the third one is for ‑‑ I forgot who said that ‑‑ but it was about the localization requirements and privacy, the fact that if we concentrate data into one place with the localization requirements, the exposure to privacy risk or data breach, for example, could be higher. Can you please come back to this point? Thank you so much.

>> MODERATOR: Okay. So maybe we can start with Jean and then move across.

>> JEAN QUERALT: Yeah. So, when it comes to the question that you just made about whether it would only be about privacy by design. That's only just one piece of it. How many engineers have ever heard about the UDHR? Very few. How many of them ‑‑ how many out of those who have heard about it actually have read it? And which percentage of those have ever gone through an exam to actually prove they understood what it means. Very few of them, so when you really want to try to have conversations where you expect them to have certain attention toward respecting human rights and they don't even know what it is, it's going to be very, very complicated.

And this may be a bit tangential but there is also a little bit of what I believe is a misunderstanding about agnostic technology. So, what a lot of engineers tell you is that they want to build technology that is absolutely agnostic. By that, they mean to say that they don't want to do something that will change direction according to the wind blowing in a specific cabinet. Which is fair enough. I mean, we don't want any technology that is specifically politically oriented because of the Pandora's box that it is.

What I feel they fail to realize is that in their decision‑making process, when it comes to designing technology, they are already taking stands, and I'm going to give you a very simple example of encryption.

Back in the days, because of a long story to explain, encryption was not one of the core elements that we have, for instance, in communication protocols, and what we realized is that was actually a very bad idea.

We started presenting that as ‑‑ so let me put a bit of context. Back in the days we started having protocols without encryption and one day realized maybe encryption is a good idea because we're sending plain text in communications that can be intercepted and weaponized against people, and we realized, yeah, not a good thing.

But then what we started to do is to pass the ball to service providers, AKA, I'm hoping Gmail is encrypting my email, but we didn't have reassurance on that because it was left to the devices of the service provider whether they wanted to decide on encrypting information or not, and that's when we realized, well, actually that wasn't a good position either, and we should move down the stack, in the communications part, the encryption itself. So, any service that I'm using, by default, is going to be using encryption and there is not a single proposal of any communications protocol nowadays that doesn't have encryption as a core feature.

I'm sorry, that was us taking a stand. You may not want to say that it's political, but it was still a stand. Technologists do have to take these kinds of positions, and as soon as possible. They can't remove themselves from that equation. Does that answer the question?

>> MARK: Hello? This is Mark speaking for the record. I'll react to two questions, first about the portability and how it relates to telecom and number portability for mobile devices, for example. I think this brings me back to a problem that I have, which is the mystification of code, and how to some degree I feel it's intentional. I don't mean necessarily in an evil way. It is sort of intentional. Code, it's seen as this very complex thing, it's in their interest that it's kept that way, it protects the coding market if it is very difficult. But it's not that difficult. At least the premises of coding are pretty simple and I do believe they should be taught in basic education because, yeah, it's very easy to think of a phone number because we've internalized it in our day‑to‑day. But this idea of how the world is made up of these machines and the code underneath them is something that we haven't been thinking about for that much of a long time, and it's exactly the reason why we should be approaching coders and policymakers to maybe understand that it's not that difficult, it's not as magical as these companies would like to make people think because that's where they get their leverage, is by saying that this is so difficult that you couldn't understand.

Can you study this, and this is ‑‑ we use former NASA staff to do this? Sure, you do, it's still code, it's still possible to study, so that's my point. If people would care enough and try to understand enough, yeah, it's a point that is easy to make. You can go and say when they do ‑‑ when they give those explanations, you can say that that's not true. It's portable. I can prove to you that you can do this, this, and this, and so that would be my personal answer to that.

And about the data localization issue and why I think it would ‑‑ why it's not ideal. Here I'm speaking from my perspective, right, based on my intervention, because here I'm thinking about the actor that is in control of the data. So, data localization, in a large way, switches away from the private sector towards the government. And when I think about who I want to challenge, would I rather challenge a company or a state? I have a problem thinking that a state would be very interested in what I have to say, because a company, at the end of the day, if we establish a standard in an international body and it becomes industry practice, they kind of have to follow it. It happened in browsers, and if you remember browsers used to be a wild west. Everybody was doing whatever they wanted, but once a firm standard was established, everybody had to converge in a certain direction, to industry practice. States don't act that way. States do whatever they want. We have seen this again and again that they don't care. So, is it better to still have it in the private sector? Yeah, I think so. Like, we discuss a lot about this, about these companies holding our data, but there is still there to tap into and force them to do things and compel them in states, but that's my personal take and I'm sure somebody else can make the opposite argument. Thank you for the questions.

>> DUNCAN McCANN: Yeah, so I'll just tackle the question on what kind of effect contextual advertising could have on the sector generally, and I think it would really be substantial, and especially for Google. Facebook is a slightly different beast because actually its ad model works slightly differently to what we talked about, but for Google at the moment, a lot of the ad revenue comes from owning the real‑time bidding kind of middle piece of the puzzle, and so any ad that's being placed now on the Internet, Google is kind of getting ‑‑ or there are two platforms, but the Google ad one is the dominant one. So, Google would still be able to earn from ads and in fact the most effective ads in the kind of online space are the ads that appear next to your searches in Google, and there is a really good reason for that. It doesn't take a genius to figure out what you're thinking about because you've just typed it into a search box, and so obviously, search terms related to what ‑‑ ads related to the search term you just looked for work well and those are the best ads in terms of click‑through rates on the Internet. So Google would still have, in ad terms, the prime real estate, so it wouldn't decimate the revenue but it would stop it from taking a cut of everyone else's real estate. When you go to BBC News or the Guardian or any other publication, Google is now taking a cut, and this would return it much more to kind of those places, and so I think it would really have a really strong kind of effect in terms of the revenue around marketing, but again, I think for me the other effects about decommodifying data and reducing the need to collect it all are almost as important, and I'll certainly let you come back in a second.

In terms of the independence, I think this is really about how you design it, so in the UK we have a number of kind of organizations working for the public which are not government run. The BBC is always a prime example of this. It's set up, even receives annual funding from the government, but the government has no oversight in terms of how it runs or setting the mandate for the BBC.

So, we can look at organizational structures like that as we go and set up and really all government does is sets up the funding and sets up the right framework in which this has to operate. I'll just say a tiny bit about re‑identification because we had a quite robust debate about that in the UK and even got quite close to implementing a law which would have made it illegal to re‑identify people from de‑identified data sources.

In the end, it fell because of questions of legitimate investigations, what happens if the press receives a dataset, reidentifies, identifies a crime, and so it was being unable to kind of resolve these corner cases which actually made the legislation fall, but ultimately I think we need to think about something like that, that makes reidentification prohibited with some sort of public interest exemption around it. So yes, I don't know if you want to come back quickly on the point around concentration and then Deepti?
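The re‑identification risk discussed here can be made concrete with a small check. The sketch below, in Python, measures k‑anonymity: for a chosen set of quasi‑identifying fields, it finds the smallest group of records that share the same values. A group of size 1 means that record is unique and therefore easy to single out. k‑anonymity is one common way to reason about this risk and was not named by the speakers; the function names and the sample field names (`district`, `crop`) are assumptions for illustration only.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    values for the given quasi-identifier fields (0 if there are no records)."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(counts.values()) if counts else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values that occurs
    is shared by at least k records."""
    return smallest_group(records, quasi_identifiers) >= k
```

A dataset that passes such a check for one set of fields can still fail once more fields are combined, or once it is linked with an outside dataset, which is why a technical check of this kind complements rather than replaces the legal prohibition being described.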

>> Maybe I'm not sure what I mean by contextual advertising, because you just said that you then now go to FT and then so Google used to get a cut of that. So, under contextual advertising, it's now all FT revenue?

>> Yes. The FT has actually switched the model over time so it now does 70% of the advertising in a very traditional way, the way old‑school institutions used to do it. You say hey, I've got 20 ad slots here and people bid directly to the FT to place their ads there. Not in a kind of one‑off view.

>> AUDIENCE MEMBER: Knowing customers want readers of FT.

>> DUNCAN McCANN: Yes. Going back to old school but using technology to deliver it rather than being in print and having a physical print version.

>> MODERATOR: Yes?

>> Sorry. I just wanted to intervene, that Duncan, you narrowly responded to the question of independence.

>> DUNCAN McCANN: I'm sorry. I thought I had. So, I think it's all about how you kind of set up the institution. So, in the UK, we have a number of organizations that are publicly funded, have a public mission, but are absolutely independent of the government and the prime example is the BBC.

Now, obviously, the BBC has issues outside of necessarily control, and the way the governance of the BBC is set up, if it was up to me, I would change it. But I think that what this demonstrates, and we have a number of organizations like this in the UK and in Europe, is that you can have a publicly funded organization, with the broad remit of the organization set by government together with Civil Society and the population, which then operates completely independently. So the government cannot tell the BBC what to put out, it cannot censor the BBC, nor does it have special access to the BBC. We need to think innovatively, so I don't think the BBC is the gold standard, but it's an easy example that many people are familiar with of how these arrangements can, in fact, start to work.

>> MODERATOR: Okay, so just very quickly on the re‑identification question. It is a very important consideration and I think there are two immediate solutions to that. One, we need to establish no‑go zones. Don't collect data that doesn't have to be collected, and that is something that I think needs to be very well thought out when we think about what should be in community data.

Of course, I didn't mention it, but I do want to say that it's not as if any idea of community data doesn't proceed with the idea that citizen privacy and citizen safeguards are paramount. I'm just going to leave it at that because we have a lot of people who probably want to ask questions, and so yeah. Please go ahead, and then the lady in the back and then ‑‑

>> AUDIENCE MEMBER: I can start. My name is Cal. I work for GIZ, the German development cooperation agency, and I mainly work on trade policy issues at the moment, and of course, we think, and it has been proved in the past, that trade can be very beneficial if it's done in the right way.

Now, you mentioned, especially I think Deepti, you mentioned policy space and that's also a consideration we are making. We also are convinced that there needs to be regulatory space, and while you also have to make some compromises.

So now we have this whole new field of e‑commerce and data governance, and I'm at least personally still struggling to get a clear idea of what our recommendation to a developing country should be. How should they use this policy space? What can they do domestically on all of those issues that you mentioned? It could also be data portability. My key interest would be e‑commerce related things like competition: what do you do with the platforms when, in the situation you're in, you're limited? So how should they use the policy space, and what are the tradeoffs in the policy space? What is to be kept at all costs, and what are the kinds of compromises where there is something to gain that is acceptable? That would be very interesting to me. Thank you.

>> MODERATOR: Yeah. So, the lady and then.

>> AUDIENCE MEMBER: Hi. I'm Jennifer from RNW Media, an NGO that builds digital communities around the world and is based in the Netherlands. I have a question. In a not so distant past life I worked in public health, and with HIV we did training and awareness raising. This is a question for each of you or none, perhaps, I'm not sure. Do you know of any good projects being developed to raise public awareness, and I mean really the real public, right, everybody who is out there who is not necessarily involved in Internet Governance?

Just a really simple thing, I don't think most people know how this all fits together, and so we've all read a lot about Cambridge Analytica and that created a lot of power which can be useful but I don't know how people make the connection with the advertising example, and are there any interesting projects coming up around that? Thanks.

>> AUDIENCE MEMBER: Hello, I'm faculty at IIT Bombay, Centre for Policy Studies. Specifically on AI policy related questions, I was hoping to hear more about that from this panel, because you can't really talk about data without talking about the intelligence that is extracted from it. The thing with machine learning is that often you can sort of get away with things, even if you claim that you're protecting data. For example, anonymization is something that could be supported with machine learning, but consent is something that could be manufactured. And you could basically get intelligence from what could be called community data and not personal data, and so that is one point for the panel to address.

The second point is the question of standards, which was raised by the panelists. The thing is that, from an AI perspective, most AI governance right now happens through ethics councils. Basically, there is not really much regulatory space out there, and there is really no proof to show that these ethics councils are in any way useful and not merely platitudes. More importantly, I would go as far as to say that they have done more harm than if they had not existed. Because of their existence, the companies, the data monopolies, have been able to get away with saying that they have done certain things, whereas the material consequences of doing those things haven't really changed the landscape at all and have contributed to, A, functioning at places where AI artifacts are used in ways they're not intended to be; B, collusion between private and state actors, where you have private players pushing certain AI technology and that being bought by the state because there was no regulation mandating what can or cannot be run; and C, of course, anticompetitive practices and monopoly. So these are the two angles I would wish the panel to address. Thanks.

>> MODERATOR: Is there a question related to somebody in particular? Okay.

I just wanted to know, are there more people that would like to ask questions? So, if you could raise your hands. Okay. So, there is just one so we can end with that and then we could do the wrap up, that's why I just wanted to know. Please go ahead.

>> AUDIENCE MEMBER: Very quickly to Duncan, do you think we need completely different economic analysis for browsing data and data that's not browsing data? Because for browsing data it might well be true that it's actually a bubble and there is no reason for that large amount to exist in the way that it does, but with other kinds of data, especially, for example, mobility data, the monopoly might actually be so efficient that it may not be a bubble and you need to think about different interventions there.

>> AUDIENCE MEMBER: Parminder from IT for Change. I think we have now identified the problem, and the solutions being sought go either completely into the private sector zone or into the norm zone or somewhere in between. The question is to Mark, who was saying that data portability is probably the solution and would more or less address the problem. I would raise this issue: Facebook has come up with a very sophisticated document on very proper data portability. They seriously mean it, they have laid out the plan, Google and all are supporting it, and everything is perfect in the plan; I don't see anything missing as far as data portability goes. But you probably have to ask why they are happy with that. I suspect that they can deal with data portability as long as they remain dominant in the sector, and they know what kind of limited options individuals have and what they use, and they just want to stay away from public law. So is it really possible to keep looking for solutions in the private realm, not going to public law, and then keep figuring out every five years that we did not succeed and think of something else, while meanwhile the corporate power is entrenching?

>> MODERATOR: All right. So again, I would like to start from Jean. So, if we could keep responses brief because we are 5 minutes to closing time and we'd like to also take like closing comments if possible, so please go ahead.

>> JEAN QUERALT: I don't think there were any specific questions toward me.

>> MODERATOR: A question on the AI?

>> JEAN QUERALT: I will not have an answer on that particular one.

>> MODERATOR: No problem. Mark?

>> Difficult questions. Thank you for them. Mark, for the record. I will briefly address the AI issue. The reason why I think these ethics councils are not useful is because they're political in nature. Again, if you look at the choices, at least from my perspective, they're not the people who are best suited to integrate them. They're the people who will look good, or who will be a good compromise, or who will look good for the board, but the councils are not being built around goals, they're not driven toward success. They're politically motivated, and many of them have been failing, right.

We see, just this year, three major ethics councils in AI have been dismantled, one from a very big company that I won't name. But we are starting to see some form of deep self‑regulation on that matter, and this brings to the table the DeepNude app, which was the app that intelligently undresses women. That was considered so bad that code repositories wouldn't carry it; it was sort of removed from the Internet by the will of the people involved in AI‑focused projects. They're like, this is not where we want to go, and that is a form of self‑regulation that we might start seeing more of in the future as the field matures. This is one example that I can think of where this actually worked.

And about the data portability plan, it's a very good plan because it doesn't involve the community, doesn't involve a standards body, doesn't involve a forum, doesn't involve anything that would make people actually participate, and my question often is, why is there not the equivalent of an ICANN or an IGF for data? If there are so many researchers, so many activists, so many people involved, so much private sector, why not? Why are decisions still being made within a very select circle? So, they must be happy because otherwise the alternative to that would be for us to come together in a room with a bunch of experts and really dig down into that, and that's very undesirable, right. So sorry for the briefness. I would have more to say, but in the interest of time.

>> DUNCAN McCANN: Yeah, so I'll just touch very quickly on trade policy. I've been looking into that quite a bit, and the free trade agreements, and whereas it started with a kind of slightly noble aim to enable e‑commerce globally, I think it sprang into things like source code, free flow of data internationally, even things like authentication, in effect taking away states' control over this and taking their powers, and this is concerning. For me, especially on the third, it would really be to leave the policy space open for countries rather than try to fit it into a model that, at least on cursory analysis, looks like it will benefit the dominant players rather than really enable the kind of development that we would like to see.

On raising public awareness, I think it's so important, and I think, you know, it's so easy when you work in here and you can kind of see the impacts that you would automatically expect other people to be in the same place and people really aren't.

But I think there has been a big change. So, when we started talking about some of these things a year ago, it took a lot of effort to get people to where you were. Whereas now, one, there are more people seeing the dangers of rampant data extraction, profiling, and the advertising that goes along with it. These big events like elections always bring it up. So for instance, one of the things that we're doing in the UK with the Open Rights Group is holding sessions all around the UK where, one, we're screening the film The Great Hack, well worth the watch, about the Cambridge Analytica scandal, and getting people to engage with political targeting and political analyzing, and for everybody there, we help them access the data that political parties hold on us, because ‑‑ so some of these profiles are being held in the UK by the Labour Party, they're all building individual profiles of us, and you want to understand what information they have so that we can try to build some kind of awareness around it. But it's really hard: it's ultimately technical, people don't really want to know, it's also scary, and there is no clear solution, and so it is a difficult place to raise awareness. But I think it's definitely building on its own momentum and with outside events.

And I think just in terms of different kinds of data, what I think is really important is that whereas our profiles and ad tech are built on a foundation of browsing, as you'll have seen from the graphics about the profilers, it's absolutely not limited to that. They're connecting that to your digital set‑top box, so they're analyzing what you're viewing; they're linking it with your location data and where you go; they're linking it with other services that you buy; they're linking it with the apps, you know, so many of the apps are kind of transferring data automatically. So this is much more than that, certainly from a personal perspective, but I would agree that there are other kinds of data that are important for different reasons that we must absolutely get out there. I think of cities, and especially the platforms that operate in cities: this is absolutely the data that should be made available to the cities, and so it's totally ‑‑ you understand that kind of community aspect of data. And, you know, places like Barcelona are taking those first steps, where now Uber has to hand over some of their data ‑‑ I'm sorry, AirBnb has to hand over some of their data, and other cities are doing the same negotiation over data with some of the other platforms, so I think it's pretty important.

>> MODERATOR: Okay. I'm going to very quickly respond to the trade question because we're actually over time. So, in terms of how to use the policy space, I think there are two or three ‑‑ I think it entirely depends on where the particular nation is in the digital global value chain.

For example, you said that nations find it limiting, and India doesn't find it so very limiting because of many factors. We have a strong ETS sector and a large population of tech workers and engineers, and we also have a market power of 600 million users, which makes it comparatively easier for us to take certain steps in the policy space of trade. That's not necessarily true for other nations. Recognizing that, I think there are two or three things. One is, I think we need to create policies at the national level and even more so at the local level so that we're not foreclosed at the international level, and so thinking about these issues at both of these levels is very, very important.

And I just want to echo something that, you know, an associate of ours, an activist who has been working on the issue of agricultural con is education, has been saying for a long time: that we keep talking about the fourth Industrial Revolution, and one thing we must remember is that a lot of countries in the global south haven't even gone through the first. So there is a big leap here about AI and automation and all of these things and how we use data, et cetera, where you still have large populations unconnected, you know, not comprehending but very much co‑opted into the global data regime. So until we get there, I think it's also maybe good for nations which are not really yet there to even stay out of these kinds of trade agreements and strengthen national capacity building before even getting into those spaces.

And look at the manifesto of the Just Net Coalition; we just released a document at the IGF which looks at some of these issues and has a set of principles. I'm sorry, we are out of time, and we are actually over time, so I want to thank all of our panelists for coming here on the last day of IGF. I don't want to go back to the slides, but I hope we were able to make you revisit or re‑affirm some of the ideas about data governance, so thank you so much, and thank you to my wonderful panelists.

(Applause)
