Kushal Das: Tracking my phone’s silent connections

My phone has more friends than me. It talks to more peers (computers) than the number of human beings I talk to on average. In this age of smartphones and mobile apps for A-Z things, we are dependent on these technologies. However, at the same time, we don’t know much about what is going on inside these computers, equipped with powerful cameras, GPS, and microphones, that we carry all the time. All these apps are talking to their respective servers (or can we call them masters?), but there is no easy way to track them.

These questions bothered me for a long time: I wanted to see the servers my phone is connecting to, and I wanted to block those connections as I wish. However, I never managed to work on this. A few weeks ago, I finally sat down to build a system, reusing already available open source projects and tools, which will allow me to track what my phone is doing. Maybe not in full detail, but it should at least shed some light on the network traffic from the phone.

Initial trial

I tried to create a wifi hotspot at home using a Raspberry Pi, and then started capturing all the packets from the device using standard tools (dumpcap) and later reading through the captures using Wireshark. This procedure meant that I could only capture traffic when I was connected to the network at home. What about when I am not at home?

Next round

This time I took a slightly different approach. I chose algo to create a VPN server. Using WireGuard, it became straightforward to connect my iPhone to the VPN. This setup also makes it very easy to capture all the traffic from the phone on the VPN server. A few days into the experiment, Kashmir Hill started posting her experiment named Life Without the Tech Giants, in which she blocked all the services from 5 big technology companies. With her help, I contacted Dhruv Mehrotra, who is the technologist behind the story. After talking to him, I felt that I was going in the right direction. He has already posted details on how they did the blocking, and you can try that at home 🙂

Looking at the data after 1 week

After capturing the data for the first week, I moved the captured pcap files to my computer. I wrote some Python code to put the data into a SQLite database, enabling me to query the data much faster.
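
As a rough illustration of that step (not the actual code from this experiment), here is a minimal sketch that loads already-parsed packet summaries into SQLite and counts connections per destination. The table layout, the function names, and the sample records are all assumptions made for the example; the addresses come from the reserved documentation ranges, and a real version would fill the list from a pcap parser.

```python
import sqlite3

# Hypothetical packet summaries: (unix timestamp, source IP, destination IP,
# destination port, protocol). These are made-up sample rows, not captured data.
SAMPLE_PACKETS = [
    (1548979200.0, "10.0.0.2", "192.0.2.10", 443, "tcp"),
    (1548979201.5, "10.0.0.2", "198.51.100.7", 443, "tcp"),
    (1548979203.2, "10.0.0.2", "192.0.2.10", 80, "tcp"),
]


def load_packets(conn, packets):
    """Create the (assumed) packets table and bulk-insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packets ("
        "ts REAL, src TEXT, dst TEXT, dport INTEGER, proto TEXT)"
    )
    conn.executemany("INSERT INTO packets VALUES (?, ?, ?, ?, ?)", packets)
    # An index on the destination keeps the GROUP BY below fast on large captures.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_packets_dst ON packets (dst)")
    conn.commit()


def connections_per_destination(conn):
    """Return (destination, packet count) pairs, busiest destination first."""
    return conn.execute(
        "SELECT dst, COUNT(*) AS n FROM packets "
        "GROUP BY dst ORDER BY n DESC, dst"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_packets(conn, SAMPLE_PACKETS)
print(connections_per_destination(conn))
```

Using an in-memory database keeps the sketch self-contained; pointing `sqlite3.connect` at a file path would keep the data around between sessions.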

Domain Name System (DNS) data

The Domain Name System (DNS) is a decentralized system which translates human-memorable domain names (like kushaldas.in) into Internet Protocol (IP) addresses (like 192.168.1.1). Computers talk to each other using these IP addresses, so we don’t have to remember so many numbers. When developers write their applications for the phone, they generally use domain names to specify where the app should connect.

If I plot all the different domains (including any subdomain) which got queried at least 10 times in a week, we see the following graph.
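
The filtering behind such a graph can be sketched in a few lines. This is only an illustration: the helper names and the sample queries are made up, and grouping subdomains under their last two labels is a naive assumption (it gives wrong results for suffixes like co.uk).

```python
from collections import Counter


def base_domain(name):
    """Naive grouping (an assumption): keep only the last two labels."""
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def frequent_domains(queries, threshold=10):
    """Return (domain, count) pairs queried at least `threshold` times."""
    counts = Counter(base_domain(q) for q in queries)
    return [(d, n) for d, n in counts.most_common() if n >= threshold]


# Made-up query log standing in for a week of DNS lookups.
sample = (
    ["api.example.com"] * 7
    + ["cdn.example.com."] * 5
    + ["tracker.example.net"] * 9
    + ["www.example.org"] * 15
)
print(frequent_domains(sample))
```

Here example.net is dropped because it was queried only 9 times, while the two example.com subdomains are counted together.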

The first thing to notice is how the phone is trying to find servers from Apple, which makes sense as this is an iPhone. I use the mobile Twitter app a lot, so we also see many queries related to Twitter. Lookout deserves a special mention there; it was suggested to me by friends who understand these technologies and security better than me. The 3rd position is taken by Google. I do watch YouTube videos sometimes, but the phone also queried many other Google domains.

There are also many queries to Akamai’s CDN service, and I could not find any easy way to identify those hosts; the same goes for the Amazon AWS related hosts. If you know any better way, please drop me a note.

You can see that a lot of data analytics companies were also queried. dev.appboy.com is a major one, and thankfully algo already blocks that domain at the DNS level. I don’t know which app is trying to connect to which servers; I found out about a few of the apps on my phone by searching the client lists of the above-mentioned analytics companies. Next, in the coming months, I will start blocking those hosts/domains one by one and see which apps stop working.

Looking at data flow

The number of DNS queries is an easy start, but, next I wanted to learn more about the actual servers my phone is talking to. The paranoid part inside of me was pushing for discovering these servers.

If we plot all of the major companies the phone is talking to, we get the following graph.

Apple leads the chart with 44% of all the connections, 495225 in total. Twitter is in second place, and Edgecastcdn is in third. My phone talked to Google servers 67344 times, which is about 7 times fewer than the number for Apple.
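
A quick check of those figures, as a small sketch. Only the two connection counts and the 44% share come from the data above; the total and Google's share are derived from the rounded 44%, so treat them as approximations.

```python
apple = 495225        # connections to Apple (from the chart)
google = 67344        # connections to Google (from the chart)
apple_share = 0.44    # Apple's share, already rounded in the chart

# How many times more often the phone talked to Apple than to Google.
ratio = apple / google

# Approximate total number of connections implied by the rounded share.
implied_total = apple / apple_share

# Google's approximate share of all connections.
google_share = google / implied_total

print(f"Apple/Google ratio: {ratio:.2f}")
print(f"Implied total connections: about {implied_total:,.0f}")
print(f"Google share: about {google_share:.1%}")
```

So "about 7 times" holds up, and Google ends up with roughly 6% of all the connections.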

In the next graph, I removed the big players (including Google and Amazon). Then, I can see that analytics companies like nflxso.net and mparticle.com have 31% of the connections, which is a lot. Most probably I will start by blocking these two first. The 3 other CDN companies, Akamai, Cloudfront, and Cloudflare, have 8%, 7%, and 6% respectively. Do I know what these companies are tracking? Nope, and that is scary enough that one of my friends commented “It makes me think about throwing my phone in the garbage.”

What about encrypted vs unencrypted traffic? Which protocols are being used? I tried to find the answer to the first question, and the answer looks like the following graph. Maybe the number will come down if I refine the query and add other parameters; that is a future task.
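
One crude way to approximate that split is to bucket connections by destination port. This is only a sketch of a possible heuristic, not the query used for the graph: the port sets are my assumption, a port number says nothing certain about the payload, and the sample ports are made up.

```python
from collections import Counter

# Assumption: ports that conventionally carry TLS-wrapped traffic
# (HTTPS, DNS over TLS, IMAPS, POP3S, Apple push notifications).
TLS_PORTS = {443, 853, 993, 995, 5223}
# Assumption: ports that conventionally carry plaintext (DNS, HTTP, NTP).
PLAINTEXT_PORTS = {53, 80, 123}


def classify_port(port):
    """Guess whether traffic to this destination port is encrypted."""
    if port in TLS_PORTS:
        return "likely encrypted"
    if port in PLAINTEXT_PORTS:
        return "likely unencrypted"
    return "unknown"


def summarize(ports):
    """Return the percentage of connections in each bucket."""
    counts = Counter(classify_port(p) for p in ports)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Made-up destination ports standing in for captured connections.
sample_ports = [443] * 6 + [80] * 2 + [53] + [8080]
summary = summarize(sample_ports)
print(summary)
```

A more reliable answer would need to look at the packets themselves, for example checking for a TLS handshake, which is the kind of refinement left as a future task here.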

What next?

As I said earlier, I am working on creating a set of tools, which can then be deployed on the VPN server, that will give people a user-friendly way to monitor and block/unblock traffic from their phones. The major part of the work is to make sure that the whole thing is easy to deploy and can be used by someone with less technical knowledge.

How can you help?

The biggest thing we need is the knowledge of “How to analyze the data we are capturing?”. It is one thing to make reports for personal use, but trying to help others is an entirely different game altogether. We will, of course, need all sorts of contributions to the project. Before anything else, we will have to pull together the random code we have into a proper project structure. Keep following this blog for more updates and details about the project.

Note to self

Do not try to read data after midnight, or else I will again mistake a local address for some random dynamic address in Bangkok and freak out (thank you reverse-dns).
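
A small guard against that kind of midnight scare, sketched with Python's standard ipaddress module: check whether an address is local before trusting any reverse-DNS result. The function name and labels are made up for the example.

```python
import ipaddress


def describe_address(addr):
    """Classify an IP address before worrying about where it 'lives'."""
    ip = ipaddress.ip_address(addr)
    # Loopback and link-local addresses also count as private,
    # so the more specific checks come first.
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "private (local network)"
    return "public"


for addr in ["192.168.1.1", "10.0.0.5", "127.0.0.1", "8.8.8.8"]:
    print(addr, "->", describe_address(addr))
```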

Planet Python

Kushal Das: When I was sleepy

Back in 2005 I joined my first job, in a software company in Bangalore. It was the backend of a big foreign bank. We trained heavily on different parts of software development during the first few months. At the same time, I had an altercation with the senior manager (about some Java code) who was in charge of the new joiners and their placement within the company. The result? Everyone else got a team but me, and I had to roam around the office to find an empty seat and wait there till the actual seat owner came back. I managed to spend a lot of days in the cafeteria on the rooftop. But then they made new rules that one cannot sit there either, other than at lunch time.

So, I went asking around, talking to all the different people in the office (there were 500+ folks iirc) to find out if they knew of any team who would take on a fresher. I tried to throw in words like Linux and open source to better my chances. And then one day, I heard that the research and development team was looking for someone with Linux and PHP skills. I went in to have a chat with the team, and they told me the problem (it was actually on DSpace, a Java based documentation/content repository system), and after looking at my resume decided to give me a desktop for a couple of weeks. I managed to solve the problem in the next few days, and after a week or so, I was told that I would join the team. There were a couple of super senior managers and I was the only kid on that block. Being part of this team allowed me to explore different technologies and programming languages.

I will later write down my experiences in more detail, but for today, I want to focus on one particular incident, the kind of incident which all system administrators experience at least once in their life (I guess). I got root access to the production server of the DSpace installation within a few weeks. I had a Windows desktop, and used putty to ssh in to the server. As this company was the backend of the big bank, except for a few senior managers, no one else had access to the Internet on their systems. There were 2 desktops in the kiosk on the ground floor, and one had to stand in a long queue to get a chance to access the Internet.

One day I came back from lunch (a good one), and was feeling a bit sleepy. I had taken down the tomcat server, pushed the changes to the application, and then wanted to start the server up again. I typed the whole path to startup.sh (I don’t remember the actual name, I’m just guessing it was startup.sh) and hit Enter. I was waiting for the long screens of messages this startup script spewed as it started up, but instead, I got back the prompt quickly. I was wondering what went wrong. Then, looking at the monitor very closely, I suddenly realised that I had been planning to delete some other file, had written rm at the beginning of the command prompt, forgotten it, and then typed the path of startup.sh. Suddenly I felt the place get very hot and stuffy; I started sweating and all the blood drained from my face in the next few moments. I was at panic level 9. I was wondering what to do. I thought about the next steps to follow. I still had a small window of time to fix the service. Suddenly I realized that I could get a copy of the script from the Internet (yay, Open Source!). So, I picked up a pad and a pen, ran down to the ground floor, and stood in the queue to get access to a computer with Internet. After getting the seat, I started writing down the whole startup.sh on the pad and double checked it. I ran right back up to my cubicle and feverishly typed in the script (somehow, miraculously, without any typo in one go). As I executed the script, I saw the familiar output, messages scrolling up, screen after joyful screen. And finally as it started up, I sighed a huge sigh of relief. And after the adrenalin levels came down, I wrote an incident report to my management, and later talked about it during a meeting.

From that day on, before doing any kind of destructive operation, I double check the command prompt for any typo. I make sure that I don’t remove anything randomly, and also make sure that I have my backups in place.

Kushal Das: That missing paragraph

In my last blog post, I wrote about a missing paragraph. I did not keep that text anywhere, I just deleted it while reviewing the post. Later Jason asked me in the comments to actually post that paragraph too.

So, I will write about it. 2018 was an amazing year, all told: good, great, and terrible moments all together. There were certain highs, and a few really low moments. Some things take time to heal, some moments make a lifelong impact.

The second part of 2018 went downhill at a pretty alarming rate, personally. Just after coming back from PyCon US 2018, from the end of May to the beginning of December, within 6 months we lost 4 family members. On the night of 30th May, my uncle called, telling me that my dad was admitted to the hospital, and the doctor wanted to talk to me. He told me to come back home as soon as possible. There was a very real chance that I wouldn’t be able to talk to him again. Anwesha and I managed to reach Durgapur by 9AM, and dad passed away within a few hours. From the time of that phone call, my brain suddenly became quite detached, very calm, and focused on the next steps: things to be handled, official documents to be taken care of, what needs to be done next.

I felt a few times that I’d burst into tears, but the next thing that sprang to mind was that if I started crying, that would affect my mother and the rest of the family too. Somehow, I managed not to cry, and every time I got emotionally overwhelmed, I started thinking about the next logical steps. I actually made sure I did not talk about the whole incident much, until recently, after things settled down. I also spent time in my village and then in Kolkata.

In the next 4 months, there were 3 more deaths. Every time the news came, I did not show any reaction, but it hurt.

Our education system is what is supposed to help us grow in life. But I feel it is more likely that school is just training for the society to work cohesively and to make sure that the machines are well oiled. Nothing prepares us to deal with real life incidents. Moreover, death is a taboo subject with most of us.

Coming back to the effect of these demises, for a moment it created a real panic in my brain. What if I just vanish tomorrow? In my mind, our physical bodies are some amazing complex robots/programs. When one fails, the rest of them try to cope, try to fill in the gaps. But the nearby endpoints never stay the same. I am working as usual, but somehow my behavior has changed. I know that I have a long lasting problem with emails, but that has grown a little out of hand in the last 5 months. I am putting in a lot of extra effort to reply to the emails I actually manage to notice. Before that, I would open the editor to reply, but my mind blanked, and I could not type anything.

I don’t quite know how to end the post. The lines above are almost like a stream of consciousness in my mind and I don’t even know if they make sense in the order I put them in. But, at the same time, it makes sense to write it down. At the end of the day, we are all human, we make mistakes, we all have emotions, and oftentimes it is okay to let them out.

In a future post, I will surely write about the changes I am bringing into my life to cope.

Mike Driscoll: PyDev of the Week: Kushal Das

This week we welcome Kushal Das (@kushaldas) as our PyDev of the Week! Kushal is a core developer of the Python programming language and a co-author of PEP 582. You can learn more about Kushal by checking out his blog or his GitHub profile. Let’s take a few moments to get to know Kushal better!

Can you tell us a little about yourself (hobbies, education, etc):

I am a staff member of Freedom of the Press Foundation. We are a non-profit that protects, defends, and empowers public-interest journalism in the 21st century. We work on encryption tools for journalists and whistleblowers, documentation of attacks on the press, training newsrooms on digital security practices, and advocating for the public’s right to know.

I have also been part of various Free Software projects throughout my life. I am a core developer of CPython, and a director of the Python Software Foundation. I am part of the core team of the Tor project. I have been a regular contributor to the Fedora Project for over a decade now.

I co-ordinate https://dgplug.org along with a large group of friends and fellow contributors from various projects. We spend time together learning new things and helping each other out on the #dgplug IRC channel on the Freenode server. Feel free to visit the channel and say “Hi” to us.

I try to write about the things I learn regularly on my blog.

Why did you start using Python?

I started learning Python at the end of 2005. I wanted to write code for my new Nokia phone, and Sirtaj Singh Kang suggested that I learn Python for that. While doing so I found that I had to write far fewer lines of code, and the code was also much easier to understand. I started talking more with the wider Python community over the Internet, and that hooked me into it more. As Brett Cannon said: “Came for the language, stayed for the community.” That is true for many of us.

What other programming languages do you know and which is your favorite?

Throughout my programming life, I have kept learning a new language every 8 months to a year. Before I started writing Python, I used to write C/Java/PHP based on what I was working on. Around 2009 I started spending time with functional programming, and loved Lisp a lot. I spent around a year writing more Lisp and trying to figure out how to use the ideas from there in my daily Python programming life. From 2013 I started writing Go, and I do have many projects written in Go.

But lately I am writing more and more Rust. I really like the community and also the compiler 🙂

Just in case anyone wants to know how much we love Python in the family, our daughter is named “Py” 🙂

What projects are you working on now?

In my day job, I maintain the SecureDrop project along with an amazing team of maintainers and community. SecureDrop is an open source whistleblower submission system that media organizations can install to securely accept documents from anonymous sources. It was originally coded by the late Aaron Swartz and is now managed by Freedom of the Press Foundation.

I am also working on various Python projects which will enable us to have a new Desktop client for the journalists on Qubes OS. Qubes Ansible is another project where I am trying to make sure that we can use Ansible to maintain our Qubes systems.

Which Python libraries are your favorite (core or 3rd party)?

I think I use the json module from the stdlib and the third-party requests module almost everywhere. IIRC my first ever CPython patch was about adding tests for the json module.

In the Python world there are many other amazing libraries which I use regularly, most of them are the product of our amazing community.

What top three things have you learned contributing to open source projects?

  • People are more important than any code.
  • Be nice to everyone.
  • Communication is the key tool for everything in this modern connected world. We have to communicate a lot more in writing than over video/audio calls.

Is there anything else you’d like to say?

I would suggest that new programmers look into more upstream projects. We need help at various levels in all of the projects, so there is a chance to contribute not only through code, but in many different ways.

Last, but not least, I would love to mention my wife Anwesha, who, despite being from a completely different background, helped me to contribute more to the upstream projects and also started helping out projects herself as required.

Thanks for doing the interview, Kushal!

Kushal Das: 2018 blog review

Last year, I made sure that I spent more time writing, mostly by waking up early before anyone else in the house. The total number of posts was 60, but that number came down to 32 in 2018. The number of page views, though, was 88% of 2017’s.

I managed to wake up early on most days, but I spent that time reading and experimenting with various tools/projects. SecureDrop, Tor Project, and Qubes OS were at the top of that list. I am also spending more time with books, though now the big problem is finding space at home to keep those books properly.

I never wrote regularly throughout the year. If you look at the dates I published, you will find that sometimes I managed to publish regularly for a month and then vanished again for some time.

There was a whole paragraph here about why I did not write and vanished, but then I deleted the paragraph before posting.

You can read the last year’s post on the same topic here.
