IGF 2016 - Day 3 - Room 1 - WS42: How can Privacy help us harness ‘Big Data for Social Good’?

 

The following are the outputs of the real-time captioning taken during the Eleventh Annual Meeting of the Internet Governance Forum (IGF) in Jalisco, Mexico, from 5 to 9 December 2016. Although it is largely accurate, in some cases it may be incomplete or inaccurate due to inaudible passages or transcription errors. It is posted as an aid to understanding the proceedings at the event, but should not be treated as an authoritative record. 

***

>> YIANNIS THEODOROU:   Good morning, everyone.  Thanks for ‑‑

[Speaking Spanish].

Whose members connect more than 4 billion unique subscribers.  This format is very much intended to be an open and inclusive session, so please feel free to interact, ask questions and share your views.  To start off the conversation, we've invited six distinguished speakers.  First we've got Mila Romanoff, who is joining us remotely from New York.  I've got Miguel on my left, followed by Michel Reveyrand de Menthon, Alexandrine Pirlot de Corbion, and Fernando Sosa, from the protection of rights and sanctions area of Mexico's data protection authority.  And last but not least, Boris, who will introduce himself.  I'm sure we all agree that big data involves at least three things: combining data from multiple sources, analyzing them, and hopefully coming up with insights that help inform decisions.  Now what do I mean by social good?  Well, it's about services, whether it's predicting and monitoring epidemics, saving people's lives after disasters, population displacements, urban planning, traffic management and so on.  Now a number of organizations are working on big data for social good initiatives using data they have access to, and so on.  And clearly, where personal data is involved, there are established and well-known techniques to ensure risks are mitigated, and I'm sure the speakers will touch on this. 

What we would like to focus on today is how we move forward, beyond small scale projects, to something bigger, and achieve scalability and accessibility in big data for social good initiatives.  So I guess one of the first questions I would like to pose to Mila and Michel is: what does big data for social good mean for your organizations, and what are some of the successful projects that have impacted people's lives?  So Mila, over to you in New York first.  Thank you. 

>> MILA ROMANOFF:  Good morning, everyone.  I assume everyone can hear me well.  Yes, can you hear me?

>> YIANNIS THEODOROU:   Yes, we can. 

>> MILA ROMANOFF:  Wonderful.  Thank you very much for the kind invitation to speak on such a distinguished panel.  First I would like to introduce myself and say a little about the organization that I work for.  Next slide, please.  I'm a specialist at UN Global Pulse, a special initiative of the UN Secretary-General.  Our initiative has been established to understand and explore how big data could be harnessed for public and social good, and how such data could be harnessed in a privacy-protective way.  As you know ‑‑ I'm not sure if you can see the slides, but the next slide please ‑‑ in 2015 member states came together and adopted the Sustainable Development Goals, a set of goals that try to address the most pressing issues around the world: hunger, health, climate change, gender equality, peace, justice, security and so forth.  So we're trying to understand how big data coming mainly from nontraditional data sources, such as mobile phone data, social media data, or data collected from postal or financial transactions, could be used to understand how, for example, people are struggling with unemployment or other economic issues, or how mobile phone data could be used to understand and help disaster response.  In our team we have data engineers and policy specialists, and we work with a variety of humanitarian and development agencies to try and tackle these issues with the help of big data coming from these various sources.  So if we go back to the first slide and then to the next slide, there are a few projects that I'm happy to share with you, and mainly, because the panel is organized by the GSMA, I thought it most relevant to showcase how mobile phone data could be used to understand development. 

At Global Pulse we work to understand how call transaction data could be used to understand traffic patterns.  In Jakarta, for example, where people are calling from in the morning versus where people are calling from at night could actually show where the most congested areas are.  And this would help the ministry of transportation understand which transportation routes they should consider in the next year, for example. 

On the next slide, you will see two maps.  One map shows how anonymized call detail records could be used to understand disease spread.  In a small African country we used anonymized call detail records, which were correlated with direct evidence of where, for example, certain cases of disease spiked.  Such evidence showed that the data, at that time aggregated to a very high level, could also show how people were moving from one area to another, and that movement actually correlated with how the disease was spreading when it was compared with the actual disease spread data provided by the ministry of health. 

Another project, which we did with the World Food Programme, used the data to understand and help with disaster response.  If you see the slide ‑‑ it will be the next slide; if not, I'll explain briefly.  Basically, anonymized call detail records were used to understand how people were moving before and after the floods in Mexico.  The records were again correlated with the actual floods and the movement patterns provided by the emergency services on the ground.  And it proved that anonymized call detail records could show how people are moving at the time of, or shortly after, a disaster, which could provide humanitarians with critical information in real time on how to respond in a case of an emergency. 

Another use of the same anonymized call detail records in this humanitarian context is understanding how many people are calling.  For example, comparing how many people are calling at a normal point in time versus at the time of a disaster could alert emergency responders that something is happening.  In one of our projects, for example, we compared how people react during Christmas, when they call their relatives; of course there was a huge spike in calling behavior.  But then, during the Tabasco floods, there was another huge spike which at a regular time wouldn't occur.  So if such emergency preparedness mechanisms were actually implemented, this could save much more time for the humanitarians in understanding how people are behaving and that something is happening, so they could respond in time. 
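[Editorial illustration ‑‑ a minimal sketch of the kind of call-volume spike detection described above, assuming anonymized hourly call counts per region.  All data, names and thresholds here are hypothetical; this is not any speaker's actual system.]

```python
# Minimal sketch: flagging unusual spikes in aggregated call volumes,
# as in the Christmas vs. Tabasco-floods comparison described above.
# Works only on anonymized, aggregated hourly counts per region;
# all data and thresholds here are hypothetical.
import statistics

def detect_spikes(hourly_counts, window=24 * 7, z_threshold=4.0):
    """Return indices of hours whose call volume deviates strongly
    from the trailing window's mean (a crude z-score test)."""
    spikes = []
    for i in range(window, len(hourly_counts)):
        baseline = hourly_counts[i - window:i]
        mean = statistics.mean(baseline)
        stdev = statistics.pstdev(baseline) or 1.0  # avoid division by zero
        z = (hourly_counts[i] - mean) / stdev
        if z > z_threshold:
            spikes.append(i)
    return spikes

# Hypothetical usage: a flat baseline with one flood-like surge.
counts = [1000] * 200 + [5000] + [1000] * 50
print(detect_spikes(counts))  # -> [200]
```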

So of course big data presents many opportunities, but there are also many risks and challenges, and a panel dedicated to privacy is the right place to discuss them.  Because without privacy-protective techniques, we cannot really harness big data and we cannot utilize it for public good.  And in many projects, when you understand the risks and harms that could arise versus the benefits that you can draw from the analysis of big data, you sometimes have to say no. 

At Global Pulse we also have a privacy program, and the next slide shows a few of the activities we do within it.  We implement privacy by design in all our projects.  We have privacy principles and guidelines.  We are also working on innovation tools, that is to say, on the technology side, on quantifying the risks of identification and determining the right level of aggregation that is needed for humanitarian and development actors to utilize the data and retain its value, for example for disaster response or for disease spread, and yet preserve privacy and not take more data than you actually need.  We are also developing policy and helping build capacity within the organization on the right principles that actually apply to big data.  And I'm happy to announce that just today we published a big data guide to help harness data for development, and as part of the guide we released a checklist, a tool for risk assessment. 
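[Editorial illustration ‑‑ a rough sketch of what "quantifying the risk of identification and finding the right level of aggregation" can mean in practice: a k-anonymity-style threshold that suppresses aggregate cells describing too few people.  This illustrates one common technique only, not Global Pulse's actual tooling; all names and values are hypothetical.]

```python
# Minimal sketch of a k-anonymity-style aggregation check: cells
# (e.g. region x hour counts of callers) describing fewer than k
# individuals are suppressed before release, so no published number
# refers to a tiny, potentially identifiable group.
from collections import Counter

def aggregate_with_threshold(records, key_fn, k=10):
    """Count records per cell and drop any cell with fewer than k
    contributors before releasing the aggregate."""
    counts = Counter(key_fn(r) for r in records)
    return {cell: n for cell, n in counts.items() if n >= k}

# Hypothetical records: (region, hour_of_day) pairs from anonymized CDRs.
records = [("north", 8)] * 25 + [("south", 8)] * 3 + [("east", 20)] * 12
safe = aggregate_with_threshold(records, key_fn=lambda r: r, k=10)
print(safe)  # {('north', 8): 25, ('east', 20): 12} -- ('south', 8) suppressed
```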

Another recent development is that we also released our first report on big data for public good and the SDGs, which was produced based on the discussions of the advisory group that we established to bring in experts from around the world on the challenges that exist around big data and privacy.  I know I don't have any more time, so I'm happy to answer any questions in the question section.  Thank you very much for your attention.

>> YIANNIS THEODOROU:   Thanks, Mila.  Let's move on to Miguel and Michel and get questions from the audience in a few minutes.

>> MIGUEL CALDERON LELO DE LARREA:  Thank you, Yiannis.  Good morning, everybody.  I promise you I didn't do a copy‑paste of Mila's presentation but what you're going to see is very much alike. 

So yes, the UN has set the Sustainable Development Goals for 2030.  Yes, big data is a huge opportunity to measure progress and shape policies on some of those goals.  Big data can help us to prevent hunger, to promote health, to promote sustainable cities, to act against poverty, and I'm going to show you some examples of what we have done in Mexico with big data.  But before that: we have built a platform, and we have been able to acquire different competencies that we didn't have in the past, in data warehousing and data mining.  We have also put a lot into research and development.  With this platform, which we launch next February, we basically want to use this engine, this tool we have, to develop new projects in big data for social good. 

Let me tell you the example of Mexico in 2009.  Basically we had a big pandemic and the health organizations in Mexico didn't know exactly where ‑‑ okay. 

We're going to talk about the floods in Tabasco.  Basically, with the information that our network provided, we could identify how the people were moving, and then we could tell the government where to put the shelters, for example, to receive the people, and how to act and how to attend to the people that were affected by the flooding. 

The same goes for other disasters ‑‑ we have plenty in Mexico ‑‑ so with this information about how people are moving and where the biggest disasters are, we can help the government by providing that information to assist the people. 

We have also learned from different projects around the world.  In Spain, for example, for tourism in Valencia, understanding how people move and how to redirect the flows of people. 

Basically, I think the concern is: okay, if we want to continue doing this big data for social good, what are the questions, what are the challenges that we have?  First, from a private organization's standpoint, the first question our marketing people ask is: is this a commercial opportunity?  If I provide this for free, can I ‑‑ [ indiscernible ] opportunities.  And if data is my biggest asset, why should I be opening it up?  Then regulation: we're very concerned about data protection, we're really concerned about the privacy of the data of our users, and therefore we ask people all the time if we can use their data and why we are using their data, and we anonymize that data.  Could this harm our reputation, even if it's for social good, if some people do not like one of these projects?  And what if the anonymization is not rigorous enough?  Not that this necessarily has to be for free.  But we're very open, and we're looking at how this is going to evolve. 

On security: we're a member of the OPAL initiative, and I think my other colleagues are going to talk about this, but basically this is bringing the algorithm to the data, not taking the data out of our secure [ indiscernible ].  And the way to go is with public-private partnerships; we have been doing public-private partnerships with organizations in order to develop different programs and different opportunities.  I think I've used my 8 minutes already.
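[Editorial illustration ‑‑ a simplified sketch of the "bring the algorithm to the data" pattern Miguel describes: raw records never leave the operator, only vetted aggregate queries run inside, and only summary results are returned.  This is an illustration of the concept, not the actual OPAL codebase; every name here is hypothetical.]

```python
# Simplified sketch of the "algorithm to the data" pattern: raw records
# stay inside the operator's perimeter; only vetted, aggregate-only
# functions run there, and only their summary output crosses the boundary.
# Purely illustrative -- not the actual OPAL implementation.

def density_by_region(records):
    """Vetted algorithm: returns per-region counts, never raw rows."""
    out = {}
    for r in records:
        out[r["region"]] = out.get(r["region"], 0) + 1
    return out

class OperatorEnclave:
    VETTED = {"density_by_region": density_by_region}

    def __init__(self, raw_records):
        self._raw = raw_records  # stays inside the operator's environment

    def run(self, algorithm_name):
        algo = self.VETTED.get(algorithm_name)
        if algo is None:
            raise PermissionError("algorithm not vetted by the consortium")
        return algo(self._raw)  # only aggregates leave the enclave

# Hypothetical usage by an external institution:
enclave = OperatorEnclave([{"region": "dakar"}, {"region": "dakar"},
                           {"region": "thies"}])
print(enclave.run("density_by_region"))  # {'dakar': 2, 'thies': 1}
```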

>> YIANNIS THEODOROU:   Thank you very much.  Over to Michel. 

>> MICHEL REVEYRAND DE MENTHON:  Thank you.  Thank you very much, and thank you for inviting us to this workshop.  Of course we are [ indiscernible ] of the first two presentations, and in particular that's why we are participating.  We are acting in what we call the OPAL project; in fact we have the same approach and the same history as Telefonica.  We identified years ago that big data could offer a lot for understanding many kinds of social realities, and we have worked on this.  We began with a working group, five years ago now, around the idea of data for development.  It was a very large partnership in which we participated, in particular with a lot of universities, with a lot of research and scientific projects.  We worked on Ivory Coast, and then Senegal, as two experimental approaches, in very close association with ‑‑ [ indiscernible ] all institutions, ministries and so on.  Completely associated with ‑‑ we saw what could happen in many sectors, in planning and [ indiscernible ]. 

We worked with different universities to elaborate more than 100 experimental scientific projects with many, many partners ‑‑ [ indiscernible ].  The result has been six final projects, [ indiscernible ] always around the same kind of concept.  In Senegal, for example, we worked on statistics to have a better understanding of the situation of the population: a more precise identification of the diversity of situations than we had before, and the ability to follow movements.

We get more precise knowledge than before.  We get to understand the evolution of the situation, and not only a single picture.  We clearly identified the interest of many local and international universities and institutions.  And I think we have to face all of the difficulties, all of the challenges identified before, everything about privacy and the possibilities of misuse.  At the conclusion, we began to participate in similar efforts around the world, for example.  The next step, which tries to take stock of all the conclusions of this former experience, is a research pilot that we are trying to develop in a large partnership, with partners on board with us: first we work with [ indiscernible ] and many universities in the UK, of course Telefonica, and we are open to working with more.  We want to see this approach through.  OPAL stands for open algorithms, and the project is as follows: we do not give direct access to the data; we propose access to these algorithms to all institutions which could be interested.

We organized a steering committee with the partners presented before, but also a partnership with UNFPA, trying to enlarge this approach.  For now we are concentrating on Senegal, and we are in a partnership with Telefonica in Colombia.  We have funding for both ‑‑ to be able to have an algorithm [ indiscernible ]. 

In practice, any institution interested in working on the data could ask OPAL, the consortium: we need such data.  We try to produce the data through this collective algorithm process, and we deliver the expected knowledge which could be interesting.  So we are moving this pilot project forward now.  We expect first pilot results next December, and then we will have seen all we could.  Of course we are [ indiscernible ] now to finance this kind of project.  And of course, if we want it to go further, there is a large issue: we begin in one country with some institutions, to produce it and make an open tool for everybody who would like to use this tool for development.  So it is a research pilot project, and we hope that in the coming months we get more and more interest, more and more people, so it can be used as a basis by all of the institutions, all of the companies who could be interested.  Maybe ‑‑ okay, do you want to add something?  Thank you.

>> YIANNIS THEODOROU:  Thank you very much, Michel.  I think all the speakers gave nice examples bringing to life what they mean by big data for social good, and it's certainly refreshing to hear about moving beyond single solutions to something that involves more and more partners working together.  So it's great. 

Before we move to the next speakers, who will touch on privacy in greater detail, I just want to pause and see if there are any questions, either in the audience or online.  So any contributions?  Yes, please? 

>> AUDIENCE:  Good morning, thank you very much.  I had a question for Miguel.  I think you said that you sought consent from customers to be able to use the data in the study that you did and I wonder if you could talk about how you managed to get consent from the customers.  Thank you. 

>> MIGUEL CALDERON LELO DE LARREA:  As I said, for example, it was like a survey.  Basically, through messaging we asked the customers if they wanted to participate in this effort, and they did.  With that, the data of the customers was not revealed.  It was all statistics, just to understand where the flu epidemic was; there was no singling out, no saying this person has the flu or not, for example. 

>> AUDIENCE:  Were customers able to opt out if they did not want to have that data as part of that survey? 

>> MIGUEL CALDERON LELO DE LARREA:  Fernando can confirm ‑‑ in Mexico we have what are called the ARCO rights, so the customer has the right to pull back their data.  This is by law.

>> AUDIENCE:  Just one more question.  How many customers do you have in Mexico? 

>> MIGUEL CALDERON LELO DE LARREA:  We have 26 million customers in Mexico but at that time in 2009 we had maybe less than 15 million.  I mean, we had ‑‑ we have a big sample to understand the problem. 

>> AUDIENCE:  Do you know how many people participated in the end? 

>> MIGUEL CALDERON LELO DE LARREA:  I can't recall right now but I can give you the information later.  Thank you very much.

>> YIANNIS THEODOROU:   Other questions?  The lady in the back. 

>> AUDIENCE:  Thank you.  My question is about the OPAL project.  I think it's a very interesting one, with lots of things set to happen.  How do you ensure the privacy of users, and who do you plan to have access to that data?  Because with the new technologies today it's also possible to combine anonymous information and re-identify data that is not supposed to be identifiable, so how do you ensure that's not happening in the future?

>> YIANNIS THEODOROU:   Is that the question for Miguel or Michel? 

>> MICHEL REVEYRAND DE MENTHON:  Of course it is a problem for everyone in this kind of issue.  At this time we don't explicitly ask: do you agree with the use of your data?  The data has been used in a way which guarantees absolute confidentiality for individuals, for groups and so on.  We work very closely with all of the ‑‑ [ indiscernible ].  We work step by step with governments, local authorities, and so on, to see whether it could be a real issue.  The technical approach to the data is a technical treatment of the data, and the collective algorithm treatment described before was organized on a pure scientific basis and with a specific technical approach ‑‑ I'm not at all a technician for this ‑‑ which guarantees that it will not be possible to identify people, and also not specific groups.  In detail, we want to interrogate some mechanisms: at what point is it possible to identify a specific village, a specific community, and so on ‑‑ it is a real issue in a sense.  Clearly we are in an experimental approach.  We are not at all opposed to the idea of asking people: do you accept that your personal data may be used in a collective approach or not ‑‑ it's a very important part of this experiment, of this pilot dimension.  At this time it was technical treatment by scientists which in the end completely guarantees ‑‑ and there has not been difficulty in all that has happened so far.  Clearly, I think everybody considers what is much more important to be the risk that could be behind this.  But we are open to any kind of organization, any kind of management approach, to guarantee this at this time. 

>> AUDIENCE:  You talk about risk; do you have any impact assessment of what the risk could be from the misuse of this information?

>> YIANNIS THEODOROU:   I missed the beginning ‑‑

>> I missed the beginning of your question.

>> AUDIENCE:  Yes, I said.  Did you have any impact assessments at the moment about how this information could be misused? 

>> MICHEL REVEYRAND DE MENTHON:  Well, if you want to misuse it as personal information, certainly in the end it may be possible, I don't know.  But there are legal protections, organizational policies, and of course some kind of scientific ethics behind all of this; for now the treatment is done in laboratories, of universities in particular. 

>> MIGUEL CALDERON LELO DE LARREA:  May I?  In Mexico, and in many countries where Telefonica operates, there are data protection and privacy laws.  In Mexico it is the law on the protection of personal data held by private organizations.  Under it you have to build a plan and you have to fulfill certain requirements.  Basically, we need to provide the regulator with the contingency methods and plans that we have.  We need to protect information.  We need to train our employees in how to deal with that private information.  You have the ARCO rights: the right for people to access, change or take down their data if they want to.  So it's heavily regulated.  Having said that, what we're trying to do here with this program is precisely to take that private data, anonymize it, and use it for different projects, in this case for the social good. 

>> YIANNIS THEODOROU:  Mindful of the next items on the agenda, can we keep the remaining questions for the end, please.  We've started talking about these issues, but I'm going to get your thoughts on what the key privacy considerations are when talking about big data for social good, and how we can approach privacy in a way that makes it an enabler rather than a barrier to using big data in ways that actually enhance people's lives. 

>> ALEXANDRINE PIRLOT DE CORBION:  First of all, thank you for having us on this panel.  We've been having this discussion with some of the fellow panelists for a few years now, and I'll be pointing out things that remain of concern and things of increasing concern as well.  In the question I was posed there are two elements, and I'll treat them separately.  The first one is around the key privacy considerations for big data and social good initiatives, and the second part is how privacy can be an enabler in those initiatives.  With regards to the first part: when we're talking about big data, when it comes down to it, it's personal information ‑‑ its generation, use and collection, and, per the definition given by Yiannis earlier, the application of analytical techniques.  We have to ask ourselves how those initiatives can be regulated, and what the current trend is in situations where there is little or no regulation already in place.  There are many countries in the world where there is either no data protection law or it's not enforced the way it could be.  So the starting point is that there aren't enough protections.  And like I was saying, with big data we're thinking about bigger uses of data, with other elements coming in around the use of algorithms and the use of publicly accessible information ‑‑ which we could argue about for a long time as to what that actually means.  There's a lot of discussion as to what is publicly available information, what can be used, by whom, and for what purpose.  So the debate today is still very much about how we include these concerns in the discussions.  Obviously we're not against the purposes of the social good initiatives; these are all social problems we want and need to be solving, around poverty reduction and giving access to public services, especially for vulnerable groups in society.  But we cannot sideline the existing concerns about how these initiatives impact fundamental rights and some of the ethical values we have in our societies.  Those cannot be understated in relation to the benefits that are sought by those different initiatives.  There are three points I wanted to highlight before going to some of the specific data protection and privacy concerns.  One I've already touched upon: the structural regulatory problems.  But also, when we're talking about the different actors presented today that are involved in these initiatives ‑‑ development agencies, humanitarian agencies ‑‑ it's unclear how they are implementing the law, or whether they are operating outside the law, whether they are putting in place specific regulations for oversight, or whether it's through bilateral agreements.  There's very much a lack of transparency as to how those initiatives are designed and how the principles are implemented within them. 

The second point was around the multiple actors.  It was promoted as a good thing that an increasing number of actors are coming into this space, but that also means that the complexity of the situation is increasing.  We have the donors who fund the initiatives.  We have the host governments on whose territories those initiatives are deployed.  We've got the telecommunication service providers, we've got the international NGOs who run and implement the programs, and increasingly, with the use of algorithms in some of the projects such as those presented by Orange, we've got the designers, the technologists who are behind the creation, use and implementation of those systems.  That all raises questions around accountability, transparency, and also avenues of recourse when there are violations in how that data is used.  These are all questions that are still pending and haven't been answered. 

The last point before going to the specific privacy concerns: in the discussions we're having today, we very much need to think about who the beneficiaries are.  A lot of the time, because of the context in which we're having these initiatives, these are vulnerable groups within their own society, or people that have been forced to move, in transit, in migration, so we need to take that into account.  There's an additional duty of care to those individuals who are put in that vulnerable position.  We're seeing as well that individuals in those circumstances are not empowered to understand the implications of providing their data, and also whether there's an option, as mentioned in a question earlier, to opt out.  Can they push back on the demands to give their data when it comes to choosing between being safe from floods and being relocated or not?  And what kind of choices are individuals actually given? 

In terms of the data protection and privacy principles, I wanted to run through a few, and I'm sure my colleague on the left will mention a few as well when it comes to data protection.  There are a few I wanted to highlight, particularly around consent, which was already discussed earlier in the session: how do we ensure that the individual is given the opportunity, the space, the environment to be able to consent freely to providing their information?  Like I said earlier, we've had this discussion for many years now.  But something has changed recently, and there are probably one or two factors behind it.  It's not only people that are communicating.  It's not just the information we provide and generate when we communicate; our devices themselves are also generating data.  And this is information that is only now appearing and becoming more commonly understood in the public space.  There are two aspects to that.  We're generating more data because of the way our devices are built and the way we use them, but there's also more data being generated without the knowledge even of the ones who create the devices, let alone the user.  So that's really important to note as well: when you are asked for consent and you give your consent, people might assume that it's just the information that they're aware of revealing, when actually there's a whole set of data that they might not even be aware that they're consenting to various actors using. 

The other one that was already touched upon as well is anonymity.  There was mention of absolute confidentiality in the data sets that were used, and I'm challenging that a little bit, because there have been several studies on re-identification showing that by using two to three data sets you're able to re-identify individuals.  And linking that to the point I made about who it is we're actually talking about: these are individuals and groups in society that are put in a vulnerable position, be it that they're fleeing conflict or otherwise.  So we really need to take into account that possibility of re-identification.  If the data falls into the wrong hands and those individuals can be identified, we really need to consider what the risk to them would be. 
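[Editorial illustration ‑‑ a toy demonstration of the linkage re-identification risk described above: joining an "anonymized" data set with an auxiliary public one on a couple of quasi-identifiers can single an individual back out.  All records here are made up.]

```python
# Toy illustration of linkage re-identification: an "anonymized" data
# set joined with public auxiliary data on just two quasi-identifiers
# singles an individual back out. All records here are fabricated.

anonymized = [  # names removed, but quasi-identifiers kept
    {"zip": "75001", "birth_year": 1984, "diagnosis": "flu"},
    {"zip": "75001", "birth_year": 1990, "diagnosis": "asthma"},
]
public_register = [  # e.g. an open voter roll or social-media profile list
    {"name": "A. Martin", "zip": "75001", "birth_year": 1984},
    {"name": "B. Diallo", "zip": "10115", "birth_year": 1990},
]

def link(anon_rows, aux_rows, keys=("zip", "birth_year")):
    """Re-identify by matching records that agree on all quasi-identifiers."""
    hits = []
    for a in anon_rows:
        matches = [p for p in aux_rows if all(p[k] == a[k] for k in keys)]
        if len(matches) == 1:  # a unique match defeats the anonymization
            hits.append({**a, "name": matches[0]["name"]})
    return hits

print(link(anonymized, public_register))
# -> [{'zip': '75001', 'birth_year': 1984, 'diagnosis': 'flu', 'name': 'A. Martin'}]
```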

The other two I wanted to mention are around the quality of the data.  There's very much this buzz in the public discourse that data will bring more advantages, more benefits to society.  But we forget that behind all this generation, creation, processing and use of data, even in the creation of the algorithms, there's a human being.  And that means there's room for politicization of the data, misinterpretation of the data, and the data being used as an authoritative truth when there are concerns around whether the data is actually relevant.  We're seeing in some of the examples we were given ‑‑ and I'll happily chat about that ‑‑ data sets of one group being used to inform decisions that then impact a different group whose data was not actually used in the initial decision-making process.  That raises questions about whether the results that emerge from the systems or the algorithms used actually reflect the reality of the targeted group.  So the quality of the data is really important to consider as well.  Linked to that is discrimination.  When you're collecting data around specific groups, often they're ones you have chosen for a specific reason ‑‑ in this context we're talking about social good initiatives, so it could be the beneficiaries of those ‑‑ but we could also be talking about data on groups selected from biased, for example racial, starting points.  And that's problematic as well. 

There's also the problem of exclusion.  We assume that everyone has a digital footprint, that everyone is generating data, when actually, linking back to the groups we might be helping in various settings, especially in humanitarian situations or in post-conflict contexts, a lot of people might not be generating that data.  And again there's this issue as to what data is being used to inform, impose and implement policies and decisions that will impact those individuals. 

The last point I wanted to make on the privacy and data protection aspects is a more generic one around transparency.  There's still very much a lack of information as to who the actors are.  I gave a few examples, but I'm sure there are many more; there are intermediaries along this process that we might not be aware of.  There's a lack of transparency as to how the data is being used.  Using various user data, mobile data, for social good is actually already a secondary purpose ‑‑ so how many more purposes come after that?  And then how do you ensure consent and protection and implementation of all the data protection principles that are in place?  So there's an issue of transparency, especially in a context where there is abuse of power by governments, but also by companies, in how they can use data, how they're using it and who they're sharing it with.  We really need to think about how to address this issue of transparency. 

Quickly, relating to the second part of the question, how privacy can be an enabler of these social good initiatives: the data revolution is there.  It's been there for a long time, and it's definitely part of the solution to many things we're trying to solve.  But it is not, by itself, the solution.  So it's really important to consider what other elements need to be put in place to fully realize the benefits of all these different initiatives. 

And the last point I want to make on that is around evaluation and auditing of all these social good initiatives.  Are we reconsidering and re-evaluating how successful they have been, what the failures and benefits were, and how do we assess them to ensure that we're not making the same mistakes over and over again, particularly given how hard it is to have a data trail?  Data have life cycles that are very hard to trace.  How do we ensure that that process is transparent, so that we can understand what the implications are from the point of collection to the point of multiple uses in the future?

>> YIANNIS THEODOROU:  Thanks.  Let's move swiftly on to hear the regulator's perspective.  So Fernando, over to you. 

>> FERNANDO SOSA:  Thank you very much; I am very pleased to be here today.  I am going to make my remarks based on the questions posed by attendees.  Firstly, privacy can never be thought of or understood as a hindrance under any circumstances.  In our country, at least ‑‑ in Mexico ‑‑ privacy is a basic right.  It's a right that is necessary for citizens.  And within this rationale, any gathering of information has to guarantee certain minimum standards.  First, consent.  Consent is very important because consent is obtained for general data and for sensitive data as well, like health data.  So for any information that is going to be collected, you have to be asked about it.  If the data involves health data, there has to be written, express consent for the use of that data.  And there has to be a very specific design of security measures.  If any of those measures are not met, then the processing would be prone to privacy breaches. 

Responding to the two questions that were asked: any information that is linked to big data has to be thought of as fulfilling two principles, the principles of quality and purpose.  The information that you want to gather has to be accurate, correct, and pertinent, and there is a legal obligation on the user of that information to keep the information accurate, complete, and relevant at all times.  And the information must be collected for a specific purpose and used for the purpose it is intended for.  So if you are going to collect information for a certain purpose, then you have to use it for that purpose and that purpose alone; otherwise you would be breaching the principle of purpose.  As a regulator, when you start drawing up all these minimum requirements for the information, we tend to think of them as hindrances, but I don't believe they are.  I believe that these are the minimum requirements to be met so that the information serves all legal purposes and all purposes of social good.  So what is the approach we must take in order to understand privacy as something compatible with big data? 

We have to think about the risk management approach.  In our country we issued a law on the use of personal data by private entities.  Based on that law, companies have to adopt a self-governance data management scheme.  They have to establish a risk management profile and define their obligations.  Any company that has a self-governance scheme can collect data on any topic ‑‑ financial, health, or technology ‑‑ and they can define their priorities: saying what is to be protected, with what goal it has to be protected, how it is going to be protected, and who is responsible for the protection.  So through the risk management approach, we will control information flows in order to minimize any negative effects that could affect the availability, confidentiality and integrity of the information. 

This means that companies have to change the way in which they make decisions, and so will their whole organization.  I believe that self-governance is a way to minimize risks and will also bring more transparency: companies making more transparent all the mechanisms that are designed to obtain, gather and distribute the information.  This is so that citizens have better conditions to decide on their information.  And this is my personal belief: perhaps the self-regulation schemes have not grown stronger because we have failed to see the social benefits of sharing the information.  So I would never think of privacy as a hindrance to big data. 

If the information is going to be used, it has to meet the minimum constitutional standards, otherwise it will not be understood as properly used for the big data purposes.

>> YIANNIS THEODOROU:   Boris, can you hear us? 

>> BORIS WOJTAN:  Yes, I can hear you.

>> YIANNIS THEODOROU:   Great.  We can hear you perfectly.  So over to you. 

>> BORIS WOJTAN:  Okay, great.  So I would like to share with you a few thoughts about how big data for social good and privacy can play nicely together.  I've got three thoughts.  The first thought is that big data for social good is first and foremost a moral question.  I think the cost of computing and infrastructure has come down so dramatically over the last few years, and the expertise and sophistication of analytics has increased so dramatically, that an opportunity has now opened up.  And we have to ask ourselves: should we use this data that we have to improve people's lives?  Should we use it to stop suffering?  And of course the answer is yes.  Of course we should do that.  But that doesn't exist in a vacuum.  There are other moral considerations to take into account, and privacy is one of those.  So should we protect people's privacy?  Yes.  Of course we should do that.  But my point here is that these two moral considerations don't necessarily operate in opposition to each other.  They can work very well together.  In fact, if we get privacy right, if we design privacy into our proposals and solutions, if we get transparency right, then actually we can make the outcomes of big data for social good better, because we bring people along with us on the journey and there's trust in the ecosystem. 

The second thought is that privacy principles are an enabler for big data for social good.  And they're the perfect place to start, because I think we'll find ourselves increasingly working across geographies, across use cases, and across a very complicated legal landscape.  But what organizations actually need is simplicity and scalability.  The privacy principles lace their way through the privacy laws around the world; you encounter them again and again, and they become familiar to us.  So if we start with them, that will provide a little bit of clarity and simplicity for us.  And yes, it doesn't get you 100% compliance in every country, but it's a good place to start to get most of your privacy thinking done.  And of course privacy principles are flexible and dynamic, technology neutral and sector neutral.  So it's now up to us as an industry, and as NGOs and governments and policy makers and anyone who's involved in big data for social good, to apply those privacy principles in the context of big data.  What we don't need is a new set of regulations.  What we need to do is take those privacy principles and apply them in these contexts. 

So that brings me to my third thought or section, which is how we at GSMA and the mobile industry are thinking about these issues, and where do we see the application of the privacy principles in the big data for social good context. 

The starting point for me is always going to be the Mobile Privacy Principles, which are essentially our version of the privacy principles you would find generally, but translated into the context of the mobile industry.  We agreed and published those a few years ago; they're general, but they set a direction of travel, and we think they're a bedrock that we should refer back to. 

A few years ago we had the Ebola crisis, and there was a sudden, urgent need to access telco data that governments and various organizations wanted.  We very quickly had to come up with some guidelines, some sort of parameters, that set out how this could take place in a sensible, privacy-friendly way.  And we managed to do that quickly ‑‑ I would like to think because we love privacy principles and our members are very privacy aware.  More recently we have published a public policy position on big data and privacy, which is generally quite high level, but it sets out how we are thinking about this and how we are trying to balance the competing demands. 

Right now, there are a few things going on.  GSMA is working with a lot of its members who are already themselves working on big data for social good ‑‑ you've heard some of them speak already today, and there are others ‑‑ and we're working with them to try and leverage that to really demonstrate how this can work, how we can overcome these kinds of challenges and make it a success. 

And the other thing happening at the moment is that we are discussing, thinking through and trying to figure out how we take these privacy considerations to the next level.  The way we would like to frame it, summarizing, is that we can enable big data for social good if ‑‑ and then you come across all these considerations.  We can enable big data for social good if we understand what is personal data and what isn't personal data.  We at least have to ask ourselves that question, and where possible provide reports in an aggregate form and de-identify the data.  There's no one simple solution that will fit all these cases, but there's a point at which we ask ourselves that question. 

We can enable big data for social good if we understand who is responsible.  Many times in big data initiatives and projects you get multiple parties, with all the data sharing and pooling, and we need to understand who is really responsible for what data; particularly in the Internet of Things age there is a need to consider that. 

And alongside that, governance comes into it.  Because it's all very well thinking about those issues, but you need to have some sort of governance mechanism ‑‑ whether it's at an operator level, or, if there are multiple stakeholders involved, maybe some other form of governance.  But these are just considerations, things to think about. 

We can enable big data for social good if we think about the risks and impacts to individuals and to groups of individuals ‑‑ as has been mentioned a couple of times already, there are potential consequences for groups.  So we need to have some system for assessing risk and impact, and I think our members do.  And again, another enabling area of privacy would be the transparency point.  If we could get transparency right, then again, that is really powerful.  Okay, everybody accepts that the age of producing a small notice before you collect the data ‑‑ that is now very difficult to do.  But there are other things we can do.  We can explain, in a general way, that this kind of thing is a possibility, and the circumstances in which it's a possibility.  If we do come up with some arrangements for disaster response, making data available, then we can explain who has access to that data, for what reason and what purpose, and how it all works.  There is transparency work we can do to take people with us. 

And then there are a few other things, but to save time, I think one of the crucial things for me is the ethical dimension.  Because as we go down this road, we are going to encounter tricky questions where the law doesn't necessarily give us an answer, and the privacy principles don't give us an answer, but we have to weigh difficult issues.  And for that there needs to be some sort of ethical consideration going on.  I know that some of our members have already done this individually, so again, we want to leverage some of the great work that's already taking place.  And just to finish off, my overall message in all this is that big data for social good and privacy are not enemies; they're friends that mutually support each other.  I think if we get the privacy right, that will enhance big data for social good, and I don't see it as an obstacle at all.  As Fernando was saying, it's actually an enabling force.  So I'll hand you back to Yiannis.

>> YIANNIS THEODOROU:  Thanks, Boris.  I think that was a great concluding remark.  Before we sum up, let's see if there are any more questions from the crowd.  I know Mila wanted to make a very quick intervention on a question from before.  So Mila, are you still there? 

>> MILA ROMANOFF:  Yes, hi.  I can hear you well.  Thanks very much, Yiannis.  Actually, the question was asked in the first session, and many comments were already made by the speakers in this second session.  But I just wanted to add, echoing what Boris and Mr. Sosa said, that I think risk management, especially in the big data context, is crucial.  While we are saying yes, we do need privacy laws and we do need regulatory standards, even if we do have such a law, will it, from an international perspective, be sufficient to cover issues that relate to big data?  If we talk about consent, is consent always effective when we speak about big data, or when we speak about emergency situations, or about development in the least developed countries?  So I don't think consent or set standards are the only answer.  They are one answer, but not the only answer to how big data, or data in general, could help social and public good.  And I think risk management in general should be a very big part of the decision making.  Someone asked in the first session about impact assessments.  I think impact assessments should play a key role in how data is used for social good, especially in situations where there are no clear-cut answers, or in countries where, let's say, privacy or data protection regulations don't exist, or in the context of humanitarian or development response.  And I would say that many international humanitarian and development organizations are implementing risk impact assessments, or, as they call it, risks, harms and benefits assessments. 

In terms of big data, we actually just started exploring this and finished our first tool, which, as I said, was published just today or yesterday, on understanding how risks, harms and benefits correlate and can be assessed against each other to decide whether a project should be taken on in the humanitarian and development context, taking into consideration all the crucial and very uncertain aspects of, for example, big data, or questions related to consent.  How effective is consent?  Even if we do have consent, we should always, I think, still assess how informed the consent was; even if the individual, the beneficiary of the project, has given it to you, you still need to assess the circumstances under which such consent was given and taken.  And I know many international humanitarian organizations are working, for example, on policies and guidelines that specifically include that aspect. 
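[Editorial illustration ‑‑ a deliberately naive sketch of the kind of risks, harms and benefits balancing test described above: enumerate benefits and harms, weight harms by severity and likelihood, and require benefits to clearly outweigh them.  The published tool is a qualitative assessment, not this formula; every score, weight and threshold here is invented.]

```python
# Deliberately naive sketch of a risks-harms-benefits balancing test,
# in the spirit of the assessment tool described above. The real tool
# is qualitative; the scores and threshold here are invented.

def proceed(benefits, harms, margin=2.0):
    """Approve only if expected benefit clearly outweighs expected harm.

    benefits: list of (description, magnitude 0-10)
    harms:    list of (description, severity 0-10, likelihood 0-1)
    """
    benefit_score = sum(m for _, m in benefits)
    harm_score = sum(sev * lik for _, sev, lik in harms)
    return benefit_score >= margin * harm_score

project = {
    "benefits": [("faster flood response", 8), ("better shelter placement", 6)],
    "harms": [("re-identification of displaced people", 9, 0.2),
              ("data misuse by third parties", 7, 0.1)],
}
print(proceed(project["benefits"], project["harms"]))  # True: 14 >= 2 * 2.5
```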

Again, to summarize my remarks: it is important to include risk management, and not only at the beginning of the project, but to actually understand the risks and harms throughout, including those that relate to the ethical implications of the use of data, taking into account the context and engaging, where possible, representatives of the beneficiaries, and the beneficiaries themselves, to understand the specific context, the specific cultural attitude or the specific societal factor that could influence the outcome of that project in that specific region versus a project in another region.  And the same goes for consent.  Even if consent has been obtained and the project at first glance sounds legitimate and lawful, we still need to understand the risks and harms associated, even if the data is being used with consent.

>> YIANNIS THEODOROU:  Thank you once again, Mila.  So we've got about 7 or 8 minutes for any more questions.  I think there was a lady up there who I missed before, so I'll give you the floor first.  Be brief if possible.  Thank you. 

>> AUDIENCE:  Good morning, everyone.  I would like to ask: why this emphasis on communications data?  When you talk about what we're doing with big data, we're talking about CDRs; it's kind of the main source of data that we're processing or analyzing.  I think we have to think about that, because on one side of the issue we have the CDRs used for analysis, but on the other hand we have the push to recognize metadata as part of protected communications, and you have to consider how this may help governments.  Another point is that we have to think about why data protection may be too limited a frame for these issues, because it's not only about data protection in itself, but privacy in general ‑‑ and about where the waiver is in this information.  There was mention of the Ebola crisis and the use of metadata.  There is a very good report on how ineffective using CDRs was in that crisis, and other issues you can look at there.  It's not that effective, so we have to think about that. 

And the second part is that in Colombia we are producing a public policy on big data, and we worry about this kind of hype.  We are all in on using big data for everything ‑‑ big data for health, big data for taxes.  I think we need to think about prioritizing: what do we actually need to do with the data?  Because I think this hype is going to leave us in a worse position in terms of privacy.  If we allow this to be for everything, we cannot think through all the different issues.  So that would be a thing to discuss.

>> YIANNIS THEODOROU:   Thanks for the intervention.  In the back? 

>> AUDIENCE:  Thank you very much, and good morning to all.  I am Victor and I work for ISOC in the Dominican Republic.  My question is for the representative of the Mexican government.  In the Dominican Republic we have issued a new data protection act.  What has been the biggest challenge in Mexico in enforcing the data protection act?  Because you are 100 times the size of the Dominican Republic, and I would like to learn from you to see how we can apply your experience to our own data protection act.  Fernando? 

>> FERNANDO SOSA:  Well, as with any other law, in its very early stages the biggest issue is culture.  I understand that any law on the protection of personal data is a two-way path.  First, citizens have to understand what their responsibility is in providing the data.  And the party who is going to use the data has to understand how they are going to protect and use it.  And at the pinnacle, we have the authority.  If we fail to understand self-governance and co-responsibility, then we would not be as successful as we are in Mexico.  In Mexico we have increased protection of personal data.  We have an increased number of verification mechanisms.  We oversee many private organizations.  But I believe that the biggest challenge is training and raising awareness amongst citizens and companies of the challenges we will face.  This is our constant challenge, and this is why we have several sets of guidelines, and conferences and different training and awareness mechanisms.  On our websites we have several guidelines for security, for secure destruction of data, and for privacy principles.  And everything is a challenge.  But I think that the greatest challenge of all is culture ‑‑ change in the culture. 

>> AUDIENCE:  How are you doing.  I'm from Argentina.  My original question was for Fernando, but I think he kind of already answered it.  It was: what are the limitations that DPAs face when trying to make companies comply with data protection laws regarding big data, and how do they manage the amount of information companies collect?  So I'll open the question to the whole panel: what should the role of the DPA be when trying to make companies comply with these standards? 

>> MIGUEL CALDERON LELO DE LARREA:  In Mexico, at least, we are one of the most regulated industries.  We were among the first ones to put a self-regulation plan in front of the regulator.  And there has been, particularly in Mexico, for example ‑‑ we have to keep data for public safety purposes, and there has been a lot of discussion about what our participation in those processes is, what information we have to provide the authorities, and how we can safe-keep that information.  So there is a lot of discussion about data protection in Mexico, and the interaction between the regulator and ourselves is constant.  I don't know if that answers your question. 

>> MICHEL REVEYRAND DE MENTHON:  To be more specific on this ‑‑ I am seeing that it is crucial to accept, respect and apply national regulation everywhere, because it is too sensitive an issue ‑‑ too sensitive as a moral issue, but also for customers, and maybe for competition.  And I think we need rules that are as clear as possible.  I understand that regulation and legislation will probably lag behind, because this is complex and it takes technical expertise and so on, but we absolutely need very clear rules, inside the legislation, that all the telecom companies have to respect.  Because telecom companies are embedded in a huge legislative process in every country, with regulators and so on.  So the idea that we have in this OPAL experiment is clearly to put everything on the table, to speak about everything with national authorities and with telecom companies.  I think it is necessary to have clear rules.  I'm not convinced that it is so much more difficult than the legal issue of protecting any other kind of data; for example, it is classical that you have to protect data, and yet you have to accept communicating information for statistics and any kind of progress. 

>> ALEXANDRINE PIRLOT DE CORBION:  Two points on that.  Big data partnerships were mentioned as well ‑‑ we've got public-private partnerships, and a lot of the data protection frameworks that are in place don't necessarily apply to both of those actors, so there is already a limitation there.  The second point, using the question to raise this concern: which DPA are we talking about?  With the different actors involved, are we talking about the DPA who has jurisdiction in the country where the data is collected, where the initiative is implemented, or where the beneficiary actually is?  So there are also these questions about who holds the responsibility, and at what point of the process.

>> YIANNIS THEODOROU:  Just one final question.  Let's go there. 

>> AUDIENCE:  My point is more of a general comment than a specific question.  I'm from India.  When we talk of ethical considerations with respect to using big data for social good, the question is often framed primarily in terms of privacy, and the question I wanted to put forward is: would it make more strategic sense to frame those questions in the context of other legal theories, like antidiscrimination law?  The trouble with privacy is that most of the data protection frameworks focus on the stage of collection of data and put the onus on the individual to make informed choices based on privacy notices and other kinds of information provided to them.  Now, in the era of big data, where there is such indiscriminate collection, and often collection in many implicit forms, through IoT and through smart cities, maybe a legal framework ‑‑ or discussions around a legal framework ‑‑ which moves the focus away from data collection to the actual use cases of data and their impact, which we could possibly address through antidiscrimination law or competition law, would make sense.  Do you think it might make sense to move the conversation a little more towards those legal theories? 

>> YIANNIS THEODOROU:  Thanks, some interesting issues you've raised.  I don't know ‑‑ Boris, do you want to talk about the ethical considerations point?  Quickly? 

>> BORIS WOJTAN:  Yeah, sure.  Thanks for the question.  I think that highlights very well why we can't just view the privacy considerations in isolation, and why we need to remind ourselves of the bigger ethical picture.  Quite how we introduce those questions, I think, remains to be developed.  You could develop a privacy impact assessment structure that has an ethical hook in it and introduce those questions through that, or you could, as you're suggesting, tackle some of those ethical considerations completely outside it.  I think both of those things work.  And Yiannis, if I may just go back to the previous question as well: an important message I was hearing at the international privacy commissioners conference in Morocco recently ‑‑ and I'm hearing it more and more ‑‑ is that regulators need to be smart nowadays.  Of course their role is to enforce the law in their country, but they also need to prioritize, and to do that smartly; they need to be selective to be effective, choose the best approach to their enforcement, and also think about the economy and the economic impacts of their activities alongside the fundamental rights that go with them.  And of course talk to each other as well.  Just to throw that in, if I may.

>> YIANNIS THEODOROU:  Thanks, Boris.  So let me just very briefly sum up the key points of the discussion.  Yes, there is overall consensus that privacy is fundamental in the big data world, whether it's for social good or not; that's definitely established.  We also see that operators are working on initiatives and have all been thinking about privacy by design and applying those principles in their own initiatives.  We talked about options for protecting privacy, such as transporting the algorithm to the data rather than sharing the data itself, and doing the analysis in a controlled environment.  We talked about the human rights aspects and the challenges the big data world raises.  I think we've also talked about the need for partnerships between different players, and, touching on the laws, as Fernando and Boris noted, there are more than 100 data protection and privacy laws around the world.  A big challenge is which laws you comply with, and there is a need to reduce uncertainty; as Boris mentioned, privacy principles are a great step to ensuring that all are playing on the same field.  And I think we also touched on regulators and laws around the world. 

And I believe the last point was around the need for ethical considerations.  As big data raises challenges ‑‑ for example, how do you apply purpose limitation and other fundamental principles ‑‑ we have to think about the decisions we're taking when analyzing the data and the possible implications, not just for individuals but also for groups of individuals, as many speakers mentioned, and about the risks that are likely to arise if the analysis of data leads to misleading decisions.  So I believe that was a great session, and thanks everyone for joining.  Please join me in a big thank you to the panelists.  Thanks very much. 

[ Applause ]

[ Session concluded at 10:35 ]