Building a Distributed File Sync in Ruby (sourcerer.io)
77 points by daftpanda on Feb 24, 2018 | 20 comments


I’ve been wanting a robust, open-source, end-to-end encrypted, Dropbox-like system for a while.

Bonus points if it has any of the following features:

* certain nodes can store encrypted blobs without being able to decrypt them

* certain nodes can have read but not write access

* certain nodes can have access only to specific subdirectories

* you could set a policy for retaining old versions of files, to use it as a backup system

Is there anything like this out there?

I wonder if IPFS (and eventually Filecoin) would be a good foundation for such a system.


Check out https://syncthing.net/

Written in Go, it's lightweight (runs on my BeagleBone and Pis) and has been absolutely stable for me for years (100 GB, 10+ devices). It hits all the requirements except encrypted nodes.


I am a happy syncthing user too.

I use Restic https://github.com/restic/restic to cover the encrypted backup use case


Syncthing looks like everything I'd want. But I can't consider it because I won't be able to access my data from iOS/iPhone. Hopefully someone builds that integration in the future...



You could use a combination of syncthing and NextCloud.

NextCloud checks some of your boxes, and is a nice front-end that could run on top of a syncthing synchronized folder.


Cool idea. I will try this combination soon. For someone who has not been following the projects: NextCloud or ownCloud? Which one should I choose now?


Definitely NextCloud.


There is a client for Android though.


Thanks, I am going to check that out over the weekend!


They won't hit all your needs but consider one or more of:

0. https://librsync.github.io/ (used by Dropbox, rdiff-backup, duplicity, ...)

1. https://github.com/axkibe/lsyncd (from a Google engineer)

2. https://www.cis.upenn.edu/~bcpierce/unison/ (the B Pierce)

3. Just rsync or rclone[a] in a loop or triggered with incron (http://inotify.aiken.cz/?section=incron&page=about&lang=en)

a. https://rclone.org/
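
For option 3, here is a minimal sketch of "rsync in a loop", assuming a one-way mirror from a source directory to a destination. The function names, the paths, and the 60-second interval are hypothetical; only the `rsync -a --delete` flags are real, and `--delete` is my own choice for keeping the mirror exact, so drop it if you don't want deletions propagated.

```python
import subprocess
import time


def rsync_command(src, dest):
    """Build a one-way mirror command (hypothetical helper).

    -a        archive mode (recursive, preserves times and permissions)
    --delete  remove files from dest that no longer exist in src
    The trailing slash on src makes rsync copy the directory's contents
    rather than the directory itself.
    """
    return ["rsync", "-a", "--delete", src.rstrip("/") + "/", dest]


def sync_loop(src, dest, interval=60, rounds=None,
              run=subprocess.run, sleep=time.sleep):
    """Run rsync every `interval` seconds.

    rounds=None loops forever; `run` and `sleep` are injectable so the
    loop can be exercised without touching the filesystem.
    """
    done = 0
    while rounds is None or done < rounds:
        run(rsync_command(src, dest), check=False)
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
    return done
```

Calling `sync_loop("/data/photos", "backup:/srv/photos")` would mirror forever. Triggering from incron instead of polling avoids the fixed delay, at the cost of another moving part.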


ps. Just found this in the ostree mailing list:

https://mail.gnome.org/archives/ostree-list/2018-February/ms...

>> "Is ostree able to generate some kind of binary file containing the required diffs to get from one commit to another, which can be transferred to / copied on the ostree-host and then applied to the repository (similar to a git patch)?

> This is what `ostree static-delta generate --filename` will do, combined with `ostree static-delta apply-offline`"


There's been Tahoe-LAFS (https://tahoe-lafs.org/trac/tahoe-lafs) for a while, but I found it a bit too difficult to install and maintain.

If I'm not mistaken, all but your last point are addressed by it.


There are pre-built packages for OSX and most GNU/Linux distros. I have built it on Windows using the "Microsoft C/C++ compiler for Python" that Microsoft makes available as a free download.

Have you tried Gridsync, a GUI for Tahoe-LAFS?

https://github.com/gridsync/gridsync


I think IPFS is an excellent foundation for such a system. Indeed, a few friends and I are working on it: [1], [2]. It's too early to call it robust yet and it needs an independent security audit, but it has all your bonus features:

* no storage node can decrypt anything

* you can grant read and write access independently to individual files or folders

* you could easily store all previous versions of data, by never unpinning anything in IPFS, or having another server listening for updates and pinning them

[1] https://github.com/Peergos/Peergos

[2] https://peergos.github.io/book


I use Resilio Sync (formerly BitTorrent Sync) for this. It provides bonus features (2) and (4), and with some finagling you can also get bonus feature (3). It doesn't provide (1).

https://www.resilio.com/

Beware that you have to dodge a lot of premium functionality in the UI, and search a bit to find the secret key functionality. They originally built the app to work decentralized with just secret keys, and no accounts, but then they wanted to build a business out of it, so they started trying to compete with Dropbox, and they made a centralized service to wrap the decentralized functionality. Ignore that part.


git-annex (+assistant) supports 1, 2 (probably requiring some manual setup) and 4.

I've used it across a local NAS, a remote server (both containing the full 2TB contents and syncing automatically), one laptop (manually fetching and pushing files), one Android tablet (automatically pushing photos to the server), and S3 (extra encrypted copy of just a subset).

No problems over the couple of years I've been using it.


You could build an application on top of IPFS as an encrypted store, and use SSB (Secure Scuttlebutt) to broadcast your messages to other nodes.


The design seems nice and clean, and the primitives on the back-end are at about the right level. I wrote something similar in Go, as a replicated object-storage system. Although I didn't call it S3, I nearly did:

https://github.com/skx/sos/

Obviously there is a difference between "object store" and "file store", but it isn't so bad. I have a similar replication scheme:

* Each node knows the objects it hosts.

* Each node can poll the others to see which objects it should have.

* Missing blobs can be synced.
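
A rough sketch of that poll-and-pull scheme, in Python rather than Go and entirely in memory. This is not the code from the repository above: the `Node` class, its method names, and the use of SHA-256 content hashes as object IDs are assumptions made for illustration.

```python
import hashlib


class Node:
    """Toy in-memory storage node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}  # object ID -> raw bytes

    def put(self, data):
        """Store a blob under its SHA-256 hex digest and return that ID."""
        blob_id = hashlib.sha256(data).hexdigest()
        self.blobs[blob_id] = data
        return blob_id

    def inventory(self):
        """Each node knows the objects it hosts."""
        return set(self.blobs)

    def sync_from(self, peer):
        """Poll a peer, then copy over whatever this node is missing.

        Returns the set of object IDs that were fetched.
        """
        missing = peer.inventory() - self.inventory()
        for blob_id in missing:
            self.blobs[blob_id] = peer.blobs[blob_id]
        return missing
```

Running `sync_from` in both directions between every pair of nodes converges them on the union of all blobs. A real system would replace the direct attribute access with a network call and would also have to handle deletions, which this sketch ignores.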


Interesting project! As a Ruby developer, I love what you have accomplished with so little code.



