Project update: di.me, your online personal data manager

By | December 3, 2012

I am involved in a project that enables you to manage your personal data in a controlled, trustworthy, and intelligent way in the maze of online identity – a world in which you may wish to maintain non-linked and partially linked unique identities, each representing a different face of yourself.

Project goal
The project develops a di.me platform with a prototype that incorporates user-control at the center of its design: a private server (this could even be your laptop) is your central data node in a decentralized network, and connects you to external services such as Facebook, Google+, LinkedIn, etc. via distinct identities. By learning from your behaviour, it is able to adaptively respond to interactions and provide warning mechanisms when you try to share information that may have unintended consequences – for example, sharing that picture of you at a party with a Facebook group that includes, among other people, your boss.

Current status
Right now, we have a functional demonstration version. On it, you can create multiple personal profiles and identities, and decide which identity to use when interacting online with community functionality like messaging or sharing data. If you choose to enable these services, it can detect what situation you are in based on location data and the people near you, and use this information to provide relevant warnings and recommendations when you interact with those people through the demonstration platform. Our current version includes both a browser client and an Android app – not yet fully functional, but enough to demonstrate the power of the prototype.

We still have some issues to sort out – completing functionality so that key use cases are covered, improving performance, and expanding the proxy layer to include additional nodes and allow for an anonymous, highly secure user experience. We plan to open up for public testing in spring 2013. The prototype code will also be released as an open-source project at that point. Of course, you'll be more than welcome to test, at no cost to you.

In case you're wondering about our motivation: why, nothing other than tax money at work. We're operating under a publicly funded project, with a vested interest in putting control of digital identity in the hands of users, in a way that they can understand, despite the complexity of digital interactions, hidden policy loopholes, and all sorts of other consequences your average online teenager may never have thought of.

Project homepage: http://www.dime-project.eu/en/home/dime/project/contenido.aspx

14 thoughts on “Project update: di.me, your online personal data manager”

  1. Sophie Wrobel

    +Robert Pye I can only speculate: help European companies to play catch-up to the US-dominated social network market, and prevent unauthorized data abuse due to rather lenient standards of data protection in particular non-European countries.

    Reply
  2. Robert Pye

    +Sophie Wrobel thanks for sharing. A 'decent' solution to identity has to involve elements of the architecture you describe. I was involved in the UK's identity management scheme, where many architectural elements were ill conceived. Identity versus privacy poses some interesting moral and philosophical questions, as does the core concept of identity. I'm guessing that the funding has come from the US government? What are their political motivations for the project, do you think?

    Reply
  3. Lyndon NA

    +Sophie Wrobel hmmmm…
    Well, that may actually be a useful tool for identifying usurpation/impersonation attempts as well (a problem of ORM).
    So I hope part of the tool set includes a "list" of likely profiles that they can "claim" … and whatever is left they can disown (and get the suggestion that they report to the relevant service provider).

    I'll have a hard think and see what I can come up with as possible weaknesses/flaws (though I admit, I'm impressed so far (not an easy occurrence :D)).

    Reply
  4. Sophie Wrobel

    +Lyndon NA Thanks. I can understand your concerns – any solution putting control completely into user hands is not a directly commercially viable model, which in turn means that no company in their right mind would do something like this as their business model. But a public research project isn't allowed to bleed their user base for money, and as such we can actually focus on individual user concerns.

    If you feel that particular concerns should be addressed / emphasized, or would like further explanation, let me know and I can put you in touch with the responsible parties. The security, anonymity and non-linkability aspects of profiles, for example, have been developed primarily by the Security and Privacy Research Group at the University of Siegen. They're extremely competent in their field, and I'm sure they'll be more than happy to explain their work in depth, or consider collaborating and incorporating new ideas and opinions. We still have a few months to go before launching into public testing, so there's some room to adjust emphasis – don't hesitate to make suggestions!

    It would also be a good test for the system: di.me has a 'person matching' algorithm in place that makes profile matching suggestions when it thinks that two profiles could belong to the same person. This eases user analysis during the initial data load. For example, Sophie Wrobel on G+ and Sophie Wrobel on Facebook have the same picture, same name, and live in the same city. So the system might guess that these two individuals are the same person. Checking the linkability on your own profile could be helpful in determining how closely correlated your two identities are – and saying that you are now in NY when you announced your arrival 30 minutes earlier in Paris would be a sign that your profile may be a 'fake'.
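
    To make the matching idea concrete, here is a minimal sketch of how such a suggestion could be scored. It is not the di.me algorithm: the `Profile` fields, the `picture_hash` digest, the weights and the threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Profile:
    service: str
    name: str
    city: str
    picture_hash: str  # hypothetical: some digest of the profile picture


def match_score(a: Profile, b: Profile) -> float:
    """Return a score in [0, 1]; higher means more likely the same person."""
    name_similarity = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    same_city = 1.0 if a.city.lower() == b.city.lower() else 0.0
    same_picture = 1.0 if a.picture_hash == b.picture_hash else 0.0
    # Illustrative weights only; they are not taken from di.me.
    return 0.5 * name_similarity + 0.2 * same_city + 0.3 * same_picture


def suggest_match(a: Profile, b: Profile, threshold: float = 0.8) -> bool:
    """Suggest a match to the user when the score reaches the threshold."""
    return match_score(a, b) >= threshold


# Placeholder data: same name, city and picture on two services.
gplus = Profile("G+", "Sophie Wrobel", "Example City", "abc123")
facebook = Profile("Facebook", "Sophie Wrobel", "Example City", "abc123")
other = Profile("Facebook", "Bruce Wayne", "Gotham", "zzz999")
```

    With this placeholder data, `suggest_match(gplus, facebook)` proposes a match and `suggest_match(gplus, other)` does not. The result is only a suggestion; the user still decides whether to link the profiles.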

    What it also does is help to keep identities separate. Say you happen to have a Bruce identity on G+, and a Batman identity on G+, and I don't know that, instead thinking that Bruce and Batman are two separate persons. So you'd be able to merge my identities together and share information with me via either of those personas as you see suitable, but I'd receive the information via only one of those personas and never know that you were the same person behind Batman and Bruce.

    Reply
  5. Lyndon NA

    Thank you very much for the response and details.

    My main concerns are 3rd parties identifying users/associating profiles to individuals etc. and how much control people have.

    This looks like a massive venture and a huge undertaking.
    I wish you the best of luck with it.

    Reply
  6. Sophie Wrobel

    +Lyndon NA Most of those questions are already answered in academic publications – the complete details and proofs of correctness, etc. are buried in the papers and journals listed here: http://www.dime-project.eu/en/home/dime/publications/contenido.aspx

    The short version:

    How do you plan on preventing easy identification of mutually owned profiles?
    Firstly, di.me is a conglomerate of private servers. No user has access to any other server. Secondly, no user can request information owned by a particular profile unless they possess the correct credential for that profile. Technically, the credential is issued by a handshake protocol, where the user's private server generates a master key based on the data inside that profile. This master key is combined with information on a credential issuer to create a seed from which the private server can generate an access credential for the combination of one of their profiles and the person, or service, being shared with. The credential issuer does not know the identity of the person to whom it issues credentials; it can simply verify that the credential is valid, or that it isn't. This ensures that there is no way to figure out which profile is owned by which person and shared by whom, even if you have access to the database, unless you have also compromised the credential issuer, and have lots of resources to crack the encrypted secrets on both sides. Hopefully I haven't lost you there – in short, it will take a whole lot of processing power to crack a single profile!
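
    The chain of derivations (profile data to master key, master key plus issuer information to seed, seed to per-recipient credential) can be sketched roughly as follows. This is only an illustration using plain HMAC-SHA256 from Python's standard library, not the actual di.me protocol, which is specified in the publications. The function names, the `server_secret` input and the byte encodings are all assumptions. In particular, the sketch does not reproduce the property that the issuer can verify a credential without learning who holds it.

```python
import hashlib
import hmac


def derive_master_key(profile_data: bytes, server_secret: bytes) -> bytes:
    """Master key for one profile (server_secret is an assumed extra input)."""
    return hmac.new(server_secret, profile_data, hashlib.sha256).digest()


def derive_seed(master_key: bytes, issuer_info: bytes) -> bytes:
    """Combine the master key with information on the credential issuer."""
    return hmac.new(master_key, issuer_info, hashlib.sha256).digest()


def derive_credential(seed: bytes, recipient_id: str) -> str:
    """Access credential for one (profile, recipient) combination."""
    return hmac.new(seed, recipient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_credential(seed: bytes, recipient_id: str, credential: str) -> bool:
    """Recompute the credential and compare in constant time."""
    expected = derive_credential(seed, recipient_id)
    return hmac.compare_digest(expected, credential)


# Placeholder inputs, for illustration only.
master = derive_master_key(b"profile: work persona", b"private-server-secret")
seed = derive_seed(master, b"issuer: example-issuer")
for_alice = derive_credential(seed, "alice")
for_bob = derive_credential(seed, "bob")
```

    Each recipient gets a different credential for the same profile, and without the seed the two values cannot be related to each other or to the profile they came from.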

    Do you intend to anonymise the "source" – where people "are" when they connect/post/interact?
    Yes. There is a proxy layer between the private server and all cloud nodes. The source is further only identified by anonymous data, which can only be traced back to the identity via the identification mechanism mentioned above.

    Will you attempt to falsify base data (such as location/IP etc.) in an attempt to obfuscate such traces?
    No. But we support using Tor to obfuscate traces, as well as turning off location data transmission.

    Will there be a central nexus (solid or cloud) for interactions to pass through and become anonymous if so desired?
    Yes, see above.

    If the above are considered – how do you intend to tackle the legal implications, accountability or service requirements as dictated by social platforms?
    We have defined a multi-dimensional 'policy manager' which does this. Policies are defined on a combination of device, user, social platform, location, provider, and a number of other factors, and overridable by any scope entity 'upwards' in the hierarchy. Particular levels in the hierarchy may be partially defined by the server administrator (this could be you, or a community hoster if you choose to use a cloud node), who should set certain policies – e.g. country-specific legal requirements. For other requirements – e.g. whether or not you can import all your contacts, as a data controller under EU law, we rely on a combination of user input and warning messages. 
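
    As a rough sketch of the 'upwards' override idea: the snippet below resolves a single setting across scopes, letting a higher scope overrule a lower one. The scope names, their order and the `import_contacts` setting are hypothetical; the real policy manager has more dimensions than this.

```python
from typing import Dict, List, Optional

# Hypothetical hierarchy, lowest scope first; the real ordering is not given here.
HIERARCHY: List[str] = ["device", "user", "platform", "administrator"]


def resolve_policy(policies: Dict[str, Dict[str, bool]], setting: str) -> Optional[bool]:
    """Return the effective value of a setting, or None if no scope defines it."""
    effective: Optional[bool] = None
    for scope in HIERARCHY:  # later (higher) scopes overwrite earlier ones
        value = policies.get(scope, {}).get(setting)
        if value is not None:
            effective = value
    return effective


# The user allows contact import, but the administrator forbids it,
# for example because of a country-specific legal requirement.
example_policies = {
    "user": {"import_contacts": True},
    "administrator": {"import_contacts": False},
}
effective_import = resolve_policy(example_policies, "import_contacts")
```

    Here the administrator's value wins, so `effective_import` is `False`; a setting that no scope defines resolves to `None`.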

    For the system to be able to identify personal risks etc., it will have to have learned a fair amount about the user.
    How private and secure will that data be?
    See above responses. For complete control, you should set up your own private server, in which case all of the data is only stored on one place – your server – unless you choose to explicitly make it available via one or more profiles.

    What sort of measures will be taken to prevent hacking?
    (1) Data distribution, (2) Encryption of both communication channels and storage, (3) Distributed credential creation, (4) Source anonymization, (5) Relying on active community-based frameworks (in this case, Spring Security).

    If hacks do occur, what sort of "damage" can be expected (lone node/person account, multiple/mutual accounts of an individual, multiple accounts/all accounts etc.)?
    For a single hacking attempt, the damage would be limited to a lone node / person account. The effort would have to be repeated for each mutual account and for each person, even on a shared system.

    Will there be monitoring of activity?
    In the research trials, anonymized usage information (pages requested, etc.) will be collected from server logs. Otherwise there is no monitoring other than monitoring set up by the user.

    Will you flag potential hacks, usurpation or unusual activity?
    We are required by law to provide a mechanism for users and non-users to flag suspicious activities.

    Will others have access to such data?
    No.

    What about failsafes and uptime of service?
    Data is persisted across service restarts. Service uptime is highly dependent on the server operator – and in a distributed network such as this, it varies from operator to operator.

    If there is an outage, what is likely to occur?
    If you shared information over the di.me protocol without use of an external service, this information will not be available until your server is back up. The web interface and Android app will also 'not work' until the outage is over – I suppose you are used to 'fail whales' during outages anyway.

    Will people be able to manually handle things, or are they tied to the system?
    Data export is something that is on the list of things still to do. There is no wish to tie people to the system.

    Reply
  7. Lyndon NA

    Thanks to +John Kellden for the heads up.

    That sounds very interesting +Sophie Wrobel.

    Identity, Privacy, Security, Anonymity, Safety – all of these things are of interest to me.

    The problems I foresee are people's perceptions of such terms, and people's objectives.
    With that in mind…

    How do you plan on preventing easy identification of mutually owned profiles?
    Do you intend to anonymise the "source" – where people "are" when they connect/post/interact?
    Will you attempt to falsify base data (such as location/IP etc.) in an attempt to obfuscate such traces?
    Will there be a central nexus (solid or cloud) for interactions to pass through and become anonymous if so desired?
    If the above are considered – how do you intend to tackle the legal implications, accountability or service requirements as dictated by social platforms?

    For the system to be able to identify personal risks etc., it will have to have learned a fair amount about the user.
    How private and secure will that data be?
    What sort of measures will be taken to prevent hacking?
    If hacks do occur, what sort of "damage" can be expected (lone node/person account, multiple/mutual accounts of an individual, multiple accounts/all accounts etc.)?
    Will there be monitoring of activity?
    Will you flag potential hacks, usurpation or unusual activity?
    Will others have access to such data?
    If so – whom, and for what purpose?

    What about failsafes and uptime of service?
    If there is an outage, what is likely to occur?
    Will people be able to manually handle things, or are they tied to the system?

    And all of this is purely with the view of a "service", and utterly devoid of the implications of "digital identity" and the myriad ties and complications in that nest 😀

    Reply
