A few questions regarding CRDT usage from someone who lightly tested automerge/autosurgeon in Rust late last year, but hasn't used it or any other CRDT for an actual project.
1. Do you use the CRDT document as the source of truth, or just as a synchronization layer with a database as the source of truth? If the document is the source of truth, do you keep the data in it or copy it into some other format that's easier to query?
2. How do you handle changes to the schema of the CRDT documents? In my testing I had a `version` field at the top level of the documents and then a function to migrate forward between versions when a document is loaded, but I'm not sure how to handle different clients running different versions concurrently, as opposed to all clients updating at the same time. I had read some articles that alluded to letting older versions keep changing the state and then, seemingly, translating it as needed on newer versions, but they seemed to hand-wave away any details of what that would actually look like to implement.
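For concreteness, the migrate-on-load idea was roughly this shape (sketched in TypeScript rather than the Rust/autosurgeon original; the versions, field changes, and names are made up for illustration):

```typescript
// Hypothetical migrate-on-load sketch: each migration step upgrades a document
// from version N to N + 1, and the chain is replayed until the document
// reaches the version this client understands.
type Doc = { version: number; [key: string]: unknown };

const migrations: Array<(doc: Doc) => Doc> = [
  (doc) => ({ ...doc, version: 2, tags: doc.tags ?? [] }),          // v1 -> v2
  (doc) => ({ ...doc, version: 3, title: doc.name ?? "untitled" }), // v2 -> v3
];

const CURRENT_VERSION = migrations.length + 1;

function migrateOnLoad(doc: Doc): Doc {
  let migrated = doc;
  while (migrated.version < CURRENT_VERSION) {
    migrated = migrations[migrated.version - 1](migrated);
  }
  return migrated;
}
```

That works fine when every client is on the latest version; the open question is what happens when an old client keeps writing in the old shape after newer clients have already migrated.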
3. How granular do you go with documents in the spectrum of "one per user" to "one per object"?
> 1. Do you use the CRDT document as the source of truth, or just as a synchronization layer with a database as the source of truth? If the document is the source of truth, do you keep the data in it or copy it into some other format that's easier to query?
The DB is the source of truth (DynamoDB, spanned binary chunking) and holds the YJS binary document, which is also the source of truth, so I guess the doc? I keep a number of copies of this document in different states because of my aforementioned distrust of computers: 1) each edit event is recorded as JSON in S3, 2) the binary document after each edit is kept in DynamoDB, 3) the serialized JSON of the document content after each edit goes to S3. This trail of breadcrumbs keeps my anxiety down and has in the past helped recreate documents when "something bad" happened. Something bad was always _my_ poor implementation of YJS causing the document to either grow too big or start throwing warnings - both of which are catastrophic IMO if you're maintaining documents for 3rd parties.

The document is kept in sync by a state vector exchange Lambda that loads the "latest" document from DynamoDB, compares the client's state vector to the server's "latest" in the DB, and responds with a delta. All of this is binary, which is a bit unnerving. YJS provides ways to dip into that data stream, but it's equally unnerving to unbox the complex CRDT schema in the guts of the lib.

When writing data I use DynamoDB's conditional writes (optimistic locking on a version number), where conflicting writes go into a retry loop, which at worst costs a few extra ms. This ensures that "latest" never loses an event to concurrent overwrites. I rely on this and the CRDT's commutativity to make sure no writes are ever lost. Since the lib is "local first", all this interaction is transparent to the user.
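Stripped down, the read path (state vector exchange) and write path (conditional put with retry) look roughly like this - a sketch only, with illustrative table and attribute names rather than my real schema:

```typescript
import * as Y from "yjs";
import {
  DynamoDBClient,
  GetItemCommand,
  PutItemCommand,
} from "@aws-sdk/client-dynamodb";

const db = new DynamoDBClient({});
const TABLE = "documents"; // illustrative table name

// Read path: the client sends its state vector, the server answers with a
// delta containing only what the client is missing.
async function syncHandler(documentId: string, clientStateVector: Uint8Array) {
  const res = await db.send(
    new GetItemCommand({ TableName: TABLE, Key: { documentId: { S: documentId } } })
  );
  const serverDoc = new Y.Doc();
  if (res.Item?.body?.B) Y.applyUpdate(serverDoc, res.Item.body.B);

  return Y.encodeStateAsUpdate(serverDoc, clientStateVector);
}

// Write path: conditional put keyed on a version number; on conflict, reload
// "latest", re-apply the update (the CRDT merge commutes), and retry.
async function writeUpdate(documentId: string, update: Uint8Array) {
  for (;;) {
    const res = await db.send(
      new GetItemCommand({ TableName: TABLE, Key: { documentId: { S: documentId } } })
    );
    const version = Number(res.Item?.version?.N ?? 0);

    const doc = new Y.Doc();
    if (res.Item?.body?.B) Y.applyUpdate(doc, res.Item.body.B);
    Y.applyUpdate(doc, update);

    try {
      await db.send(
        new PutItemCommand({
          TableName: TABLE,
          Item: {
            documentId: { S: documentId },
            body: { B: Y.encodeStateAsUpdate(doc) },
            version: { N: String(version + 1) },
          },
          // Only succeed if nobody else bumped the version in the meantime.
          ConditionExpression: "attribute_not_exists(documentId) OR #v = :expected",
          ExpressionAttributeNames: { "#v": "version" },
          ExpressionAttributeValues: { ":expected": { N: String(version) } },
        })
      );
      return;
    } catch (err: any) {
      if (err.name !== "ConditionalCheckFailedException") throw err;
      // Lost the race: loop, reload the latest doc, merge again, retry.
    }
  }
}
```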
> 2. How do you handle changes to the schema of the CRDT documents?
I defined a new hypermedia specification, basically HTML as JSON: {id, parentId, type, props, events, acl, childIds}. I keep a flat map and build a hierarchy from it for my editor, then flatten it back out to save, storing the child IDs to maintain order when rebuilding the linked list from the YMap object (hashmap).
This way the core schema never changes; only the type (aka tag) changes, along with the props/events definitions for that type. This is all defined in a Swagger doc, which allows for this kind of schema definition and is reused at runtime for schema validation.
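As a rough sketch of that round trip (the node shape follows the fields above, but the helper names and types are illustrative, not the real code):

```typescript
// Simplified node shape from the "HTML as JSON" schema described above.
interface Node {
  id: string;
  parentId: string | null;
  type: string;                     // the "tag", e.g. "ThingV1"
  props: Record<string, unknown>;
  events: Record<string, unknown>;
  acl: Record<string, unknown>;
  childIds: string[];               // preserves sibling order
}

interface TreeNode extends Node {
  children: TreeNode[];
}

// Flat map (what lives in the YMap) -> hierarchy for the editor.
function buildTree(flat: Map<string, Node>, rootId: string): TreeNode {
  const node = flat.get(rootId);
  if (!node) throw new Error(`missing node ${rootId}`);
  return { ...node, children: node.childIds.map((id) => buildTree(flat, id)) };
}

// Hierarchy -> flat map again before saving; childIds keeps the ordering.
function flatten(root: TreeNode, out = new Map<string, Node>()): Map<string, Node> {
  const { children, ...node } = root;
  out.set(node.id, { ...node, childIds: children.map((c) => c.id) });
  children.forEach((c) => flatten(c, out));
  return out;
}
```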
To introduce changes I introduce new types (aka tags), so if I had "type: ThingV1" now I have "type: ThingV2" with a new contract. This also helps with the downstream artifacts from the programs, as the devices that get the schema can ignore types they don't implement and use the ones they do, and we can put both on the same response, whose core endpoint should always be v1 because the core schema never changes (thanks W3C for the idea).
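From the consumer's side that's roughly this - a hypothetical sketch where unknown types are simply skipped:

```typescript
// Illustrative per-type contracts; the core node schema stays untouched.
interface ThingV1Props { label: string }
interface ThingV2Props { label: string; icon: string }

// Hypothetical renderer registry on a device: it handles the types it knows
// about and silently ignores everything else in the same v1 response.
const renderers: Record<string, (props: any) => void> = {
  ThingV1: (props: ThingV1Props) => { /* legacy rendering */ },
  ThingV2: (props: ThingV2Props) => { /* new rendering */ },
};

function render(node: { type: string; props: any }) {
  renderers[node.type]?.(node.props); // unknown types are skipped
}
```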
> 3. How granular do you go with documents in the spectrum of "one per user" to "one per object"?
It depends on the requirements of the project, but in all cases the documentId is the partition key for the DynamoDB table, which allows it to scale properly. There are many documents per user, sometimes hundreds. There is absolutely no query surface for these documents; you can only look them up by direct ID. There is another system that keeps track of the document directory, and I would add a query surface of some sort there via event projection if I needed to query my documents - which I fortunately do not.
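The table itself is nothing more than a key-value lookup on documentId; a minimal sketch of that shape (table name illustrative):

```typescript
import { DynamoDBClient, CreateTableCommand } from "@aws-sdk/client-dynamodb";

// documentId as the partition key: direct lookups only, no query surface.
await new DynamoDBClient({}).send(
  new CreateTableCommand({
    TableName: "documents",
    AttributeDefinitions: [{ AttributeName: "documentId", AttributeType: "S" }],
    KeySchema: [{ AttributeName: "documentId", KeyType: "HASH" }],
    BillingMode: "PAY_PER_REQUEST",
  })
);
```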
So far so good! The only thing that is pricey is the breadcrumbs, which I'll start tuning to store less, as I'm probably copying the same data 5x or more.