Countries with (C/X)DR studies
Introduction
People have been asking me why CDR investigations are only carried out in “third world” countries like Chile (??!!); there is even a more formal (but limited) study that comes to the same conclusion. Having seen a lot of work in the area and knowing this wasn’t right, I asked this Twitter question, took the NetMob 2010, 2011, 2013, 2015 and 2017 booklets of abstracts that can be found here, and ran the following code on them:
Processing
I only worked with the oral presentation booklets (not posters or the D4D challenge). This produced a list of matching lines, using a listofcountries.txt file that I found on the internet. Many of these lines are false positives, though, either because of substring mismatches (“Mali” -> “Normalized”) or other similar effects. I then spent some time checking the files by hand and recording the matches that were in fact a mention of a specific country’s CDR dataset.
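The “Mali” -> “Normalized” kind of false positive comes from matching country names as plain substrings. Below is a minimal sketch of how whole-word matching would avoid it. This is not the script I actually ran: the function names and the sample text are made up for illustration, and it assumes the booklets have already been converted to plain text.

```python
import re


def compile_country_patterns(countries):
    """Map each country name to a case-insensitive, whole-word regex."""
    return {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in countries
    }


def find_mentions(text, patterns):
    """Return (line_number, country, line) for every whole-word match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in patterns.items():
            if pattern.search(line):
                hits.append((lineno, name, line.strip()))
    return hits


# Hypothetical sample text standing in for a booklet page.
sample = "Normalized call volumes\nCDR data from Mali and Chile"

# In practice the names would be read from listofcountries.txt, one per line.
patterns = compile_country_patterns(["Mali", "Chile"])
mentions = find_mentions(sample, patterns)

for lineno, country, line in mentions:
    print(f"{lineno}\t{country}\t{line}")
```

Word boundaries only remove the substring mismatches. A mention of a country is not necessarily a mention of a CDR dataset from that country, so the check by hand is still needed.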
Results
The following table is simply an “existence” table. It is not exhaustive; rather, it shows which countries have been studied using these methods at least once.
Continent | Country | Page | Subscribers | Length (mo) | Year | Contributor |
---|---|---|---|---|---|---|
Europe | Andorra | 105 | 1264292 | 1 | 2017 | NetMob |
America (South) | Argentina | 104 | 40000000? | 5 | 2013 | NetMob |
Europe | Austria | 55 | ? | ? | 2013 | NetMob |
Europe | Belgium | source | 2500000 | 6 | 2009 | @leoferres |
America (South) | Brazil | 46 | ? | 0.6 | 2013 | NetMob |
America (South) | Chile | 126 | 142988 | 0.5 | 2017 | NetMob |
Asia (East) | China | 85 | ? | 6 | 2017 | NetMob |
Europe | Estonia | 87 | 48871 | 0.3 | 2015 | NetMob |
Europe | France | source | 48500000 | 5 (2007) | 2018 | @Metti_Hoof |
Europe | Portugal | 72 | ? | 6 | 2013 | NetMob |
Europe | Spain | 72 | ? | 6 | 2013 | NetMob |
America (Central) | Haiti | 74 | 2900000 | 2 | 2015 | NetMob |
Asia (South) | India | 109 | 4000000 | 3 | 2015 | NetMob |
Europe | Ireland | 57 | 500000 | ? | 2011 | NetMob |
Europe | Italy | 56 | ? | 10 | 2010 | NetMob |
America (Central) | Mexico | 42 | 1000000s | 6 | 2015 | NetMob |
Africa | Namibia | 117 | 4500000 | 50 | 2017 | NetMob |
Africa | Senegal | 117 | 9500000 | 12 | 2017 | NetMob |
Asia | Nepal | 58 | 12900000 | ? | 2017 | NetMob |
Europe | Netherlands | 103 | ? | 36 | 2011 | NetMob |
Europe | Norway | 145 | 509 | ? | 2013 | NetMob |
Africa | Rwanda | source | 400000/1500000 | 56 | 2015 | @deaneckles |
Europe | Slovenia | 173 | 5000 | 1 | 2013 | NetMob |
Asia | Sri Lanka | 75 | ? | ? | 2017 | NetMob |
Europe | Switzerland | 118 | 38 | 10 | 2013 | NetMob |
Africa | Tanzania | 146 | 415000 | 4 | 2017 | NetMob |
America (North) | United States | 47 | 475000 | 2 | 2011 | NetMob |
Asia | Bangladesh | source | 5100000 | 3 | 2016 | @arutherfordium |
Europe? | England | source | 65000000 | 1 | 2010 | @arutherfordium |
Asia | Pakistan | source | 39000000 | 7 | 2015 | @arutherfordium |
Asia | Turkey | source | 3500000 | ? | 2017 | @arutherfordium |
Europe | Switzerland | source | 2700000 | 12 | 2019 | @ProfDiegoPuga |
America (South) | Colombia | source | 7000000 | 6 | 2018 | @danielapaolotti |
Africa | Cote D’Ivoire | source | 5000000 | 5 | 2012 | @lbravoc |
Conclusions
These are some general conclusions I glean from the table above. They are, alas, not scientific at this point but anecdotal, and I would be happy to discuss them. In fact, someone should do a much more in-depth and serious study and let the community know. For now, this should suffice so that I can redirect some of the questions I get to this page.
- There’s not really a preference for non-European/developing countries, at least not in this “there is (at least one) dataset for country X” review table.
- That said, CDR work does seem to prioritize certain countries (Haiti is the foremost example), but apparently for humanitarian reasons rather than because of less strict privacy laws (people will do whatever they can to help, including sharing otherwise sensitive information… these are not leaks).
- Most of these studies analyze mobile data from their own countries rather than taking data from other countries, except maybe the D4R Challenge and Haiti datasets, which were designed for external help.
Notes
- This is just one conference (albeit the most prominent one, NetMob) and, even so, not all papers have been included, so I’m completely sure that there are many, many other countries/regions that have been studied using C/XDR datasets. [ NB: As more submissions trickle in, I will have to add other sources. ]
- Sometimes, there may be little information about a dataset in a given country, but then it has been studied further in some other paper. I have recorded the page and edition of NetMob with the most information.
- There might also be some points where I’ve missed a piece of information, or even a better dataset from the same region. This should not undermine the fact that a dataset exists for that region.
- This is of course, and by necessity, quick and dirty. Anyone can send me pull requests; it’d be fantastic to have a fairly complete list of datasets that have been published. I might come back to this and run a more exhaustive search over the NetMob pages, or I might not. One thing that could be done is to search for all instances of the word “data” and see whether there are other countries (or, more likely, cities) that were not picked up by the restrictive country-name regular expressions.
Acknowledgements
I’d like to thank the following people for their Twitter replies: Esteban Moro, Marta Gonzalez, Jari Saramaki, Nuria Oliver, Erki Saluveer, Yves-Alexandre de Montjoye, Alex Rutherford.
Hope it’s useful.