Friday, October 19, 2012

Sleeping with the enemy

Today I was at the monthly meeting of the Python Madrid group, where I gave a keynote about HTML5 and the benefits it can offer to Python web developers, and also showed ShareIt! as a proof of the technology's potential. Despite some technical problems with the projector, I was finally able to show it, and it was a success :-)

They liked the idea behind ShareIt! a lot and were very interested in the technology behind it, especially in how I'm using WebSockets for the DataChannel-polyfill and in whether the final native implementations will support high-stress use like the one ShareIt! will do, or maybe bigger ones (I believe someone is thinking about highly massive videochats... :-D ).

But they also asked me a very interesting question that I hadn't thought about. The most important building blocks of ShareIt! are anonymity and confidentiality: users are identified by a random UID generated at load, and communications are encrypted (WebSockets with TLS, and DataChannels by specification). But since I'm not a security expert, there is an important security hole regarding anonymity, so it's critical: to be able to establish the PeerConnection, you need to transfer your SDP to the other peer, and the fact is that it has your public IP in the origin field (which is precisely the ID used by DataChannel-polyfill, by the way...), so the other end is able to connect to you... but also to know where you are. And this happens both ways. If at one of the ends there is a man in black listening (here in Spain we have La Innombrable, which is scarier) and you send him your shared files list, you have a problem.
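To make the leak concrete, here is a minimal sketch in Python of how little it takes to read the address out of an SDP origin line. The field layout follows the `o=` line of the SDP format; the function name `parse_sdp_origin` and the sample SDP (which uses a documentation-only IP address) are invented for illustration and are not taken from ShareIt!.

```python
def parse_sdp_origin(sdp: str) -> dict:
    """Return the six fields of the SDP origin ("o=") line as a dict."""
    keys = ("username", "session_id", "session_version",
            "net_type", "addr_type", "address")
    for line in sdp.splitlines():
        if line.startswith("o="):
            fields = line[2:].split()
            if len(fields) != len(keys):
                raise ValueError("malformed origin line: %r" % line)
            return dict(zip(keys, fields))
    raise ValueError("no origin line found")


# Invented sample session description (assumption: not real ShareIt! output).
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 20518 0 IN IP4 203.0.113.7",
    "s=-",
    "t=0 0",
])

origin = parse_sdp_origin(SAMPLE_SDP)
print(origin["address"])  # the address the remote peer gets to see
```

Anyone who receives the SDP can do the same, so the random UID alone does not hide where a peer is.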

They suggested that I use some kind of friends whitelist based on public keys to authenticate the other peer, but since you should be able to connect to anyone to fetch the data in a distributed way, and not end up with a lot of limited, private networks, this is unfeasible. Anyway, the idea kept banging around in my head and I think I found a solution: just ask for authentication when someone queries a files list. Since chunks are transferred by looking up files by their hash, you don't know what is being requested unless you already have the file with that hash, so an attacker would need to ask for all the combinations of the hash (with Tiger TTH, 2^192 combinations, bigger than the ZFS address space), and something similar would happen with a full-text search over the network, so it's impracticable. The only attack point would be asking a specific peer directly for its files list, and since this is something you won't do very frequently (you want to fetch a file, it doesn't matter where it comes from), it can be easily filtered by only allowing it if you have exchanged a public key with the other peer (something you should do only if you know who is at the other end...), so you know the request is legit and you can also send the files list data encrypted with that key. Easy and unobtrusive :-)
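A minimal sketch of that filtering in Python, under several assumptions: the `Peer` class and its method names are hypothetical, SHA-256 stands in for Tiger TTH (which is not in the Python standard library), and the actual signature check and encryption of the response are left out, so a key is simply compared by fingerprint.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Short identifier for a public key (assumption: SHA-256 of the raw key)."""
    return hashlib.sha256(public_key).hexdigest()


class Peer:
    """Hypothetical peer: chunks are open to anyone, the files list is not."""

    def __init__(self):
        self.files = {}            # content hash (hex) -> (name, content)
        self.trusted_keys = set()  # fingerprints of keys exchanged by hand

    def share(self, name: str, content: bytes) -> str:
        # SHA-256 is a stand-in here; the post talks about Tiger TTH.
        digest = hashlib.sha256(content).hexdigest()
        self.files[digest] = (name, content)
        return digest

    def trust(self, public_key: bytes) -> None:
        self.trusted_keys.add(fingerprint(public_key))

    def handle_chunk_request(self, file_hash: str, offset: int, size: int):
        # Anyone may ask: the requester must already know the hash.
        entry = self.files.get(file_hash)
        if entry is None:
            return None
        return entry[1][offset:offset + size]

    def handle_files_list_request(self, public_key: bytes):
        # Only peers whose key was exchanged beforehand get the list.
        # A real implementation would also verify a signature and
        # encrypt the answer with that key; both are omitted here.
        if fingerprint(public_key) not in self.trusted_keys:
            raise PermissionError("files list requires a known public key")
        return sorted(name for name, _ in self.files.values())
```

With this split, an unknown peer can still fetch chunks of a file whose hash it already knows, but it cannot enumerate what another peer is sharing.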

Another option would be to remove the files list request mechanism entirely and only allow searching for files. This would limit the functionality of the protocol, but it would be considerably simpler and more secure, so maybe it's a good option for a future version...