Preparing Data for a Mere Simulation is Harder Than You’d Think

Balancer Simulations is an open-source cadCAD model of a Balancer V1 AMM smart contract. It is useful for exploring “what if” questions involving an AMM pool, e.g. “as an arbitrageur, how could I have made the most profit with this particular AMM pool (given historical data)?” or “how high should I set the pool fees to provide a decent ROI to liquidity providers?”. As a fundamental DeFi building block, AMMs will increasingly be used to encourage certain behaviour – therefore it is important to find the correct parameters.


The simulation is coded to be identical to the original smart contract deployed on Ethereum, so that one can “play” with it and derive some insights. But we still needed a way to feed in user actions to the simulation, reconstructed from what actually happened on Ethereum.

The simulation doesn’t know about transactions or gas fees. Instead, it feeds on user Actions – a user creates the pool; another user joins (deposits tokens into) the pool, providing liquidity; another user swaps token A for token B, paying a fee to the pool for doing so; another user exits, taking out all the tokens he deposited in the pool + a proportion of the pool’s collected fees. These “events”, or what we called “Actions” in the data pulling script, had to be reconstructed from Ethereum log events.
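As a sketch, such an Action might look like the following. The class and field names here are illustrative only – the real script has its own Action class in data/:

```python
# Hypothetical sketch of an Action reconstructed from Ethereum log events.
# The class and field names are illustrative - the real script has its own
# Action class in data/.
from dataclasses import dataclass, field

@dataclass
class Action:
    action_type: str      # 'pool_creation', 'join', 'swap' or 'exit'
    tx_hash: str          # the transaction this Action was reconstructed from
    block_number: int
    details: dict = field(default_factory=dict)

a = Action('swap', '0xabc', 12_345_678,
           {'token_in': 'WETH', 'token_out': 'BAL'})
print(a.action_type)  # swap
```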

Getting Ethereum log Events

It’s not so easy to get historical data from Ethereum. You need access to an archive node, and an archive node takes at least 4TB of flash storage and a week or two to sync. That’s assuming everything goes well. Plus, when you finally get data out of it, it’s not organized.

Thankfully some people have been putting historical Balancer pool events on a traditional SQL database, which can be queried easily:

select * from blockchain-etl.ethereum_balancer.BFactory_event_LOG_NEW_POOL where pool="0x..."
– give me all LOG_NEW_POOL events for the pool 0x… (since a pool is only created once, there can only be one row from this SQL query)

select * from blockchain-etl.ethereum_balancer.BPool_event_LOG_JOIN where contract_address="0x..." order by block_number
– give me all the events where a user added token liquidity to the pool, sorted by the block in which they happened

select * from blockchain-etl.ethereum_balancer.BPool_event_LOG_SWAP where contract_address="0x..." order by block_number
– give me all the events where a user swapped one token for another using this pool 0x…, sorted by the block in which they happened

We do this for LOG_NEW_POOL, LOG_SWAP, LOG_JOIN, LOG_EXIT, event_Transfer, fee changes and weight changes (which were too complex to fit into one line of SQL). And afterwards we smush them all together into one sorted list of Events.

“Why didn’t you just get all the Events sorted by block number, you dummy?”
It just wasn’t possible the way the SQL data was organized. Besides, fee changes and weight changes had to be derived, they weren’t actual log Events.

Great, so we don’t need an Ethereum archive node after all, right? Wrong – there is a special anonymous Event emitted in addition to LOG_JOIN, LOG_SWAP, LOG_EXIT that has important information that can affect the simulation accuracy.

Because Ethereum transactions may or may not go through, and tokens in the pool might change in the meantime, you might not get back exactly the amount of tokens you were expecting.

For example, if you swap 1 TOKEN-A for 500 TOKEN-B, you might get 499 or 501 TOKEN-B. Fortunately there are variants of JOIN/SWAP/EXIT methods which let the user decide if he wants to spend exactly this much TOKEN-A, or if getting back exactly 500 TOKEN-B is more important to him.
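In Balancer V1 these variants are the swapExactAmountIn / swapExactAmountOut pair of methods (with join/exit counterparts). A minimal sketch of how a data script might classify them – note that obtaining the method name (from decoding the transaction) is assumed here:

```python
# Sketch: which side of a swap did the user fix? swapExactAmountIn and
# swapExactAmountOut are Balancer V1's method names; how we obtain the name
# (from decoding the transaction) is assumed here.
def classify_swap(method_name: str) -> str:
    if method_name == "swapExactAmountIn":
        return "fixed_input"    # spend exactly this much TOKEN-A
    if method_name == "swapExactAmountOut":
        return "fixed_output"   # receive exactly 500 TOKEN-B
    raise ValueError(f"unknown swap method: {method_name}")

print(classify_swap("swapExactAmountOut"))  # fixed_output
```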

Unfortunately, this important information was not included in the SQL database, so we needed an Ethereum archive node after all, and fellow developer Raul spent at least 2 nights deciphering this important information from the anonymous Event.

Put all the Events together into a list and group by txhash

events = []
events.extend(turn_events_into_actions(new_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(join_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(swap_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(exit_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(transfer_events, fees_dict, denorms_results))

events_grouped_by_txhash = {}
for action in events:
    tx_hash = action.tx_hash
    if events_grouped_by_txhash.get(tx_hash) is None:
        events_grouped_by_txhash[tx_hash] = []
    events_grouped_by_txhash[tx_hash].append(action)
# save_pickle(events_grouped_by_txhash, f'{args.pool_address}/events_grouped_by_txhash.pickle')
# events_grouped_by_txhash = load_pickle(f'{args.pool_address}/events_grouped_by_txhash.pickle')


Multiple log Events could actually have been emitted by a single Ethereum transaction. So now, given NEW, JOIN, SWAP, EXIT, and Transfer (pool shares, not the tokens) Events, we want to reconstruct the transactions that were relevant to this particular pool.

The above code simply smushes the events together into a long, unsorted list, and groups them by txhash.

Dirty detail: It says turn_events_into_actions() and it even uses the Action class in data/, but actually they are not yet real Actions, they are still individual Events. That’s because when I wrote the code, I intended to make them Actions, but many other problems came up and I quickly forgot my original intention.

Exception: irregularities in the data caused by 1inch aggregated swaps

We were getting token swaps that didn’t make sense. Which BAL got turned into WBTC, and which into WETH? It is not clear.

"action": {
    "type": "swap",
    "tokens_in": [
        { "amount": "447.23532971026", "symbol": "BAL" },
        { "amount": "157.26956649152", "symbol": "BAL" }
    ],
    "tokens_out": [
        { "amount": "7279711", "symbol": "WBTC" },
        { "amount": "6.450635831831913964", "symbol": "WETH" }
    ]
}

As it turns out, this is the work of smart hackers at 1inch saving on gas fees by aggregating swaps into a single transaction. So we had to modify our data parsing script to recognize transactions like these and emit two Actions instead of one.
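A sketch of that split. The pairing rule below (match tokens_in to tokens_out by position) is a simplifying assumption for illustration; the real script reconstructs the pairing from decoded transaction data:

```python
# Sketch: splitting a 1inch-aggregated swap into two Actions. Pairing
# tokens_in with tokens_out by position is a simplifying assumption for
# illustration; the real script reconstructs the pairing from decoded
# transaction data.
def split_aggregated_swap(action: dict) -> list:
    tokens_in, tokens_out = action["tokens_in"], action["tokens_out"]
    if len(tokens_in) == 1 and len(tokens_out) == 1:
        return [action]  # an ordinary swap - nothing to split
    return [
        {"type": "swap", "tokens_in": [tin], "tokens_out": [tout]}
        for tin, tout in zip(tokens_in, tokens_out)
    ]

aggregated = {
    "type": "swap",
    "tokens_in": [{"amount": "447.235", "symbol": "BAL"},
                  {"amount": "157.269", "symbol": "BAL"}],
    "tokens_out": [{"amount": "7279711", "symbol": "WBTC"},
                   {"amount": "6.450635", "symbol": "WETH"}],
}
print(len(split_aggregated_swap(aggregated)))  # 2
```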

Turn Events into simulation-relevant Actions

# Remove pool share transfers
grouped_events = list(filter(lambda acts: not (len(acts) == 1 and acts[0].action_type == 'transfer'), grouped_events))

actions = stage3_merge_actions(args.pool_address, grouped_events)

# save_pickle(actions, f"{args.pool_address}/actions.pickle")
# actions = load_pickle(f"{args.pool_address}/actions.pickle")

We remove pool share transfers because they are irrelevant to the simulation (the number of pool shares is a consequence of the inputs, not something we should feed into the simulation).

stage3_merge_actions is where we take Events and merge them into Actions.

  • Yes, it should be called stage3_merge_events_into_actions. Naming is hard.
  • stage3_merge_actions() doesn’t even use the Action class, which I originally intended to be used here. Oh well.
  • stage3_merge_actions() is also where we ask the Ethereum archive node for the anonymous Event, decipher it and add its data into the “Action”. This should actually belong in section 1, where we get the different Event types from the SQL database, but the code is the way it is.

Interleave hourly prices between the Actions

Since I wrote the part that gets prices from Coingecko API, I’ll explain that here.
As long as you only request 3 months of data, Coingecko gives historical hourly prices. For free. However, this only goes so far back – you won’t get hourly pricing data for 2018 even if you request 1 day at a time. This inconsistency is conveniently not mentioned on the Coingecko API page.

The hourly pricing data that Coingecko returns is not regular either – you might get

2021-01-01 00:47:33 1000
2021-01-01 01:33:21 1001
2021-01-01 02:17:05 999

which is worrisome. Now I have to round the timestamps to the nearest hour, and 01:33:21 rounds to 02:00:00, but 02:17:05 also rounds to 02:00:00! So I’ll have to throw something away.
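A sketch of the rounding and deduplication. Which of the colliding prices to keep is a judgment call; here the first price that lands in an hour wins:

```python
# Sketch: rounding Coingecko's irregular timestamps to the nearest hour and
# keeping one price per hour. Which duplicate to keep is a judgment call;
# here the first price that lands in an hour wins.
from datetime import datetime, timedelta

def round_to_hour(ts: datetime) -> datetime:
    rounded = ts.replace(minute=0, second=0, microsecond=0)
    if ts.minute >= 30:
        rounded += timedelta(hours=1)
    return rounded

prices = [
    (datetime(2021, 1, 1, 0, 47, 33), 1000),
    (datetime(2021, 1, 1, 1, 33, 21), 1001),
    (datetime(2021, 1, 1, 2, 17, 5), 999),   # also rounds to 02:00 - dropped
]

hourly = {}
for ts, price in prices:
    hourly.setdefault(round_to_hour(ts), price)

print(len(hourly))  # 2 - one of the three prices was thrown away
```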

Then again, who’s to say other pricing data services like Tradingview aren’t doing this in the background too?

Lesson Learned

Franz Kafka was known to find faults in anything that came from his own pen, except for a chosen few, amongst them The Judgment, which he supposedly wrote in a single, cohesive 8-hour sitting, and which was the first short story he was truly proud of.

That was also how I wrote the data ingestion script for the simulation. It was a beautiful, cohesive solution that fit the problem very well.

But the problem changed. The 1inch aggregated swap problem came up. Prices had to be added. The archive node had to be queried for additional information, and nobody had time to rewrite everything. Over time, it became a jumbled mess, far from the elegant solution I had envisioned.

Kafka knew what he wanted to express, and it stayed the same. But as a programmer, we think we know the problem, but we don’t. It changes, or time will show us our understanding was wrong, or incomplete. Hence:

As a programmer, prefer the flexible solution, not the most beautiful/elegant solution.

me, who else

cadCAD can’t simulate humans

Now and then, when people hear I code economic simulations with cadCAD (and I just heard about its rust cousin radCAD), they want me to write one for them. And it usually involves asking how humans would behave in a specific situation.

here’s the thing

I don’t know.

I can’t simulate human psychology.

What would people do in this or that case? This is not what a simulation does. A simulation says “given these conditions, and behaviours, this will happen (most/some of the time)”. One must be very clear about what these behaviours are and formulate the question in such a way that we can answer it without having to program mini-humans. Mini-humans are impossible to verify anyway.

Bad question: “What would humans do if I raised taxes?”
Better question: “How many people would stay in my system given a group of humans (with income distribution A, differing tolerances for tax raises B) and if I raised taxes above a certain threshold?”
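The better question can actually be answered with a few lines of code, no mini-humans required. All the numbers below are made-up assumptions for illustration:

```python
# Sketch: answering the "better question" without simulating psychology.
# The tolerance distribution is a made-up assumption.
import random

random.seed(42)
# distribution B: each person tolerates tax raises up to a personal threshold
tolerances = [random.uniform(0.2, 0.5) for _ in range(1000)]

def people_staying(tax_rate: float) -> int:
    """How many people stay in the system at this tax rate?"""
    return sum(1 for t in tolerances if t >= tax_rate)

print(people_staying(0.1))   # 1000 - everyone tolerates a 10% tax
print(people_staying(0.6))   # 0 - nobody tolerates a 60% tax
```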

You get the idea.

Math helps to set your intentions in stone. If you can express every interaction and change as an equation, everyone can verify that the simulation is working as intended. After all, if I write a complex system that I claim simulates how humans would behave, how is anybody going to verify that? I was probably just guessing; and it’s hard to verify if the code does what I intended. It’s far easier to test if an equation computed the right result.

A Blockchain Use Case for Dancers

Hellen the dancer

wants to make a name for herself, so she uploads a video to Youtube, err, I mean, Odysee, which is just the same thing in the background: the LBRY blockchain.

Uploading the video and reserving the channel name @ZoukBerlin costs some LBRY Credits (LBC).

The end result is that the website doesn’t have to show ads, annoying everyone, to sustain itself. It earns from the very act of people uploading and curating videos.

People can choose to tip her in LBC, or support her video (again with LBC) which helps it move up in the search results.

Imagine if a normal company did this with their own credits system. They’d make it impossible for people to cash out their custom credits, locking people in! Those travel companies are doing it right now as we speak.

As people tip her LBC, Hellen could use those to boost her videos, or cash out by converting them to Bitcoin, Ethereum, or anything else. It’s a balancing act.

The Video In Question

Youtube is famous for demonetizing videos for whatever reason. Since Hellen doesn’t usually dance to silence, inevitably some record company will step up and say:

Using our song in your video is a privilege. All proceeds belong to us now, thanks!

Hellen doesn’t think so, naturally. Anselmo Ralph should be honoured that she wanted to dance to his song! Indeed, if you asked the artist himself, he might even say:

Hey, thank you for dancing to my song! (nice ass)

Suffice to say this doesn’t happen on blockchain platforms. If LBRY ever considers demonetizing videos based on their audio, then as LBC token holders, Hellen and many other content creators can even vote on the issue.

Do You Need a Blockchain? Think Again.

So went the memes a few years ago.

But they said that about computers before.

“I think there is a world market for about five computers.”

IBM’s President Thomas J. Watson, early 1940s

Hell, I said that about smartphones too when I first heard of them.

But there are 2 good use cases for a blockchain and 1 great one I thought of myself, hear me out.

An incorruptible, un-biasable Being

Governments are made of humans, so they can be put under pressure by, let’s say, rich groups of companies. Just like humans, they usually cave in under pressure.

But Bitcoin doesn’t cave in. It doesn’t say “whoops election day is coming up so let’s approve a bailout/stimulus package (aka print more money)”. It doesn’t even need money – it’s its own money! It only needs people agreeing to participate in it.

Well, okay, both Bitcoin and governments print their own money. The only difference is that Bitcoin’s printing is determined by the program that says there will only ever be 21 million Bitcoin, while governments’ printing is determined by humans.

But wait a minute, you say. Humans wrote the program, so how is it going to be any better?

Well, let’s say you want to change the rules to benefit you. Which is harder – convincing everybody who’s already running the Bitcoin software to run your new economic policy Bitcoin, or convincing a few government officials to push your policy?

I thought so.

Distributing power even more than the current stock market allows

I have news for you. None of these startups with their friendly pastel coloured ad campaigns and sans-serif fonts are on your side.

Think of how Uber is organized. At the top you have C-level executives (who own a large part of company stock), and then somewhere below them the programmers (who mostly don’t own any stock), and finally, at the very bottom, the Uber drivers (who most definitely don’t have any stock).

The Uber drivers have to do what the executives want them to do. They don’t have a choice, and they suffer as a result. After all, they’re not stockholders.

But what if in order to have anything to do with Uber, you had to own stock, even a tiny minuscule bit, and that stock came with voting rights? At the very least, drivers would have a way to push back instead of just leaving.

Bitcoin, Ethereum, any of these cryptocurrency tokens are just like a stock – except you don’t have to ask your bank to handle them for you. You can deal with them yourself by going to a website and buying them. That’s the key: it makes owning the token more direct and thereby distributes ownership/power amongst more people.

Escape a bad economy

I’m the most proud of this one, because I thought of it myself.

Think of a healthy, smart, hardworking person in a poor country (let’s call him George). No matter how healthy, smart, or hardworking he is, he’s still poor compared to the average American/European! Why? Because he has to use his country’s own currency (assuming he lives there). Maybe his country has bad politicians – but that’s not his fault and he can’t do anything about it.

Now what if George had his own economy, the GeorgeCoin?

(ok it’s an economy of one person but bear with me)

George’s country makes a bad economic decision, and overnight its currency is worth nothing, so food prices skyrocket. But GeorgeCoin’s value is still intact, because it’s separate.

It’s just like owning your house vs renting it.

This works because even if the country’s economy tanks, people still believe in George. You just need an efficient way of converting between everybody’s own Coins.


Blockchains coordinate Humans into larger Organisms

The first time I got a hint of “as above, so below” was when learning how to trade crypto. Prices go up and down in waves, and there were waves that manifested themselves on the 10 minute chart, and waves that manifested themselves on a 1 day chart, and one could make money trading on both timeframes. That is, within the larger, longer term waves, there were smaller, short term waves. It’s like fractals that you can zoom infinitely into.

Recently I found this essay “Cognition all the way down” (archive), which proposes that even cells, genes, DNA are agents that are autonomous, who find their way through life, who sense opportunities and try to accomplish things.

Thinking of parts of organisms as agents, detecting opportunities and trying to accomplish missions is risky, but the payoff in insight can be large. Suppose you interfere with a cell or cell assembly during development, moving it or cutting it off from its usual neighbours, to see if it can recover and perform its normal role. Does it know where it is? Does it try to find its neighbours, or perform its usual task wherever it has now landed, or does it find some other work to do? The more adaptive the agent is to your interference, the more competence it demonstrates. When it ‘makes a mistake’, what mistake does it make? Can you ‘trick’ it into acting too early or too late? Such experiments at the tissue and organ level are the counterparts of the thousands of experiments in cognitive science that induce bizarre illusions or distortions or local blindness by inducing pathology, which provide clues about how the ‘magic’ is accomplished, but only if you keep track of what the agents know and want.

OK, so cells are actually selfish agents. How do they cooperate to form a cohesive whole, like a human who has no sense of his constituent cells? I’m going to quote liberally from this article, just to hammer home that you should really read it.

When two cells connect their innards, this ensures that nutrients, information signals, poisons, etc are rapidly and equally shared. Crucially, this merging implements a kind of immediate ‘karma’: whatever happens to one side of the compound agent, good or bad, rapidly affects the other side. Under these conditions, one side can’t fool the other or ignore its messages, and it’s absolutely maladaptive for one side to do anything bad to the other because they now share the slings and fortunes of life. Perfect cooperation is ensured by the impossibility of cheating and erasure of boundaries between the agents. The key here is that cooperation doesn’t require any decrease of selfishness. The agents are just as 100 per cent selfish as before; agents always look out for Number One, but the boundaries of Number One, the self that they defend at all costs, have radically expanded – perhaps to an entire tissue or organ scale.

Sounds just like relationships, doesn’t it? Would you want to connect your innards with somebody who hasn’t got their life together?

The other amazing thing that happens when cells connect their internal signalling networks is that the physiological setpoints that serve as primitive goals in cellular homeostatic loops, and the measurement processes that detect deviations from the correct range, are both scaled up. In large cell collectives, these are scaled massively in both space (to a tissue- or organ-scale) and time (larger memory and anticipation capabilities, because the combined network of many cells has hugely more computational capacity than the sum of individual cells’ abilities).

To paraphrase: a cell’s lifespan and goals are short, perhaps on the order of seconds or hours (don’t ask me I’m not a biologist). As more of them collect together, their biological feedback mechanisms interact such that their lifespan and goals are larger, whether it be in terms of time or space.

Doesn’t this remind you of large sea creatures, or tall trees hundreds of years old?

The cooperation problem and the problem of the origin of unified minds embodied in a swarm (of cells, of ants, etc) are highly related. The key dynamic that evolution discovered is a special kind of communication allowing privileged access of agents to the same information pool, which in turn made it possible to scale selves. This kickstarted the continuum of increasing agency. This even has medical implications: preventing this physiological communication within the body – by shutting down gap junctions or simply inserting pieces of plastic between tissues – initiates cancer, a localised reversion to an ancient, unicellular state in which the boundary of the self is just the surface of a single cell and the rest of the body is just ‘environment’ from its perspective, to be exploited selfishly. And we now know that artificially forcing cells back into bioelectrical connection with their neighbours can normalise such cancer cells, pushing them back into the collective goal of tissue upkeep and maintenance.

Let’s return to talking about blockchain now, remembering that Ralph Merkle first compared Bitcoin to a lifeform.

Recently, humans discovered that they can unite areas larger than towns with the notion of a nation-state. A nation-state is a bigger organism than a human, can accomplish more and reach for bigger goals, yet some things are still the same. It’s got an organ that sets its direction, called the government (the brain). A nation, just like a human, needs to maintain its boundaries with force and keep order internally (the immune system). And last but not least, a system of transferring value within itself, keeping its various parts fed and nourished (the blood). It’s called a currency.

If we think about it like this, it is only natural that countries stamp out alternative currencies like Bitcoin as soon as they exist, like the Wörgl Experiment in Austria. From their point of view, it is a cancer – a totally different organism. Case in point: the new STABLE ACT from the US is all about banning stablecoins. A coin that has exactly the same value as the USD, but isn’t under the US Treasury’s control? Obviously going to undermine them at some point, which is why this isn’t surprising at all.


How token economies organize people around endeavours

When you think about it, Bitcoin is incredible. It got many people to pour lots of money into running a public goods system, a very important system that we all need – a monetary system.

Sure, there have been monetary systems before. But which one is valuable and easy to transport at the same time, yet whose authenticity is easy to verify? Not gold. Paper is supposed to be backed by actual reserves of (whatever’s valuable, usually gold), but who trusts the people running the reserve? Plus they all encourage you to spend (inflation), not save (deflation).

But we’re not here to debate monetary systems. We’re here to generalize the Bitcoin achievement into other subjects.

A Very Quick Overview of the Bitcoin Concept

The best way to make a system uncensorable (countries hate it when you issue your own currency, because it makes them irrelevant – just see Wörgl in Austria) is to spread the system across many computers, just like BitTorrent.

But how do we coordinate many computers and get them to share the same state? Bitcoin says “might makes right”, or “whoever has the most computing power is correct”. Specifically, “whoever can take a bunch (block) of valid transactions and create a hash (fingerprint) that starts with n zeros in front is correct”, where n is adjusted for difficulty periodically.

Since you have to spend a lot of electricity and computational power to find such an answer, you probably aren’t trying to sabotage the Bitcoin system. Congratulations, the algorithm will reward you with some Bitcoin (12.5 BTC as of time of writing).

Basically, if you do honest work (Proof of Work) for the system, the system will reward you with Bitcoin. Now how can we use this to reward other kinds of work?
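The “hash with n leading zeros” idea fits in a few lines. This toy version hashes a string plus a nonce; real Bitcoin double-SHA-256es a block header and adjusts n periodically, but the principle is the same:

```python
# Toy Proof of Work: hash a block of "transactions" plus a nonce until the
# hex digest starts with n zeros. Real Bitcoin double-SHA-256es a block
# header and adjusts n (the difficulty) periodically; the idea is the same.
import hashlib

def mine(block_data: str, difficulty: int) -> int:
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

nonce = mine("a bunch of valid transactions", 4)
digest = hashlib.sha256(f"a bunch of valid transactions{nonce}".encode()).hexdigest()
print(digest.startswith("0000"))  # True - this is the "proof"
```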

Decentralized Computation/AWS Lambda: Ethereum

Proof of Work: Ethereum works just like Bitcoin. The only difference is while Bitcoin tokens have no use on Bitcoin other than transferring them around, Ether is actually useful. A developer can upload programs and other people who want to use them pay for the program to run with Ether.

Or they could just send them around like Bitcoin.

Decentralized CDN/cloud storage: Siacoin, Filecoin

Siacoin: Proof of Storage + Proof of Work: A user uses SIA to pay for a file he wants stored on the Siacoin network. This SIA is set aside in escrow. To earn the SIA, storage hosts regularly submit a random segment of the original file and a list of hashes of the other segments to prove that they’re still storing the file. Regular Bitcoin-style Proof of Work ensures that everybody in the Siacoin network agrees that these proofs are valid.
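A toy version of such a storage proof, leaving out the Merkle-tree machinery the real protocol uses:

```python
# Toy storage proof: the network remembers hashes of the file's segments;
# to earn its SIA, a host must answer a random challenge with a segment
# whose hash matches. The real protocol uses Merkle trees; this omits them.
import hashlib
import random

SEGMENT = 64
file_data = bytes(range(256)) * 16
segments = [file_data[i:i + SEGMENT] for i in range(0, len(file_data), SEGMENT)]
recorded = [hashlib.sha256(s).hexdigest() for s in segments]  # kept by the network

def prove(host_copy: bytes, index: int) -> bool:
    segment = host_copy[index * SEGMENT:(index + 1) * SEGMENT]
    return hashlib.sha256(segment).hexdigest() == recorded[index]

challenge = random.randrange(len(segments))
print(prove(file_data, challenge))                   # True - honest host
print(prove(b"\x00" * len(file_data), challenge))    # False - file was lost
```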

Filecoin: Proof of Storage + Byzantine “Expected Consensus”: Filecoin has a very different design – nevertheless, the idea is still the same if you zoom out enough. Storage hosts must stake FIL tokens as collateral in case they behave dishonestly. Every 24 hours, every storage host must submit a proof that they stored the data over a period of time. If they miss this deadline, their staked FIL is slashed and they won’t earn extra FIL for that round. Randomly chosen storage hosts validate each other and produce the chain in Filecoin’s Proof of Work replacement, called Expected Consensus.

Observe: the system wants to reward you for honest work, and wants to punish you for dishonest work.

It’s just that sometimes, it’s difficult to tell a computer what “honest work” exactly is.

Decentralized Video Transcoding Service: Livepeer

Proof of Stake + proof of correctly done work: since it is difficult and expensive to verify that a video was transcoded properly, Livepeer makes it difficult to be a Transcoder. To earn the right to transcode video on Livepeer, you put lots of LPT tokens in escrow (you can ask other people to contribute to this stake, i.e. delegate their tokens to you) and if you’re in the top n, the network trusts you enough to send your computer some transcoding work. Livepeer sends random segments of the video you encoded to a third party service on Ethereum called Truebit.

I don’t quite understand just how Truebit verifies the transcoded video without actually doing the same computation, though.

If Truebit finds that a Transcoder didn’t transcode anything, Livepeer will slash the Transcoder’s staked LPT tokens. To further increase peer pressure on the Transcoder to be honest, delegated tokens are also slashed.

Decentralized Spotify: Audius

What does one need to run a streaming music service? (let’s read their whitepaper) Audius defines a content service, which hosts the actual music, encrypted of course; and a discovery service that keeps track of what content is out there.

Obviously if you run the content service, you could pirate the music that artists uploaded to your server. And obviously if you run the discovery service, you might try to game the system by unfairly promoting certain artists. That’s why you have to stake (put in escrow) at least 200,000 AUDIO for the network to recognize you as a content host, and the same for the discovery service. Artists can stake their own AUDIO tokens on you if they trust you not to pirate their music, or run their own services. If you misbehave, your tokens get slashed, just like in Livepeer.

However, a blockchain can’t tell if music was pirated, or if an artist was unfairly promoted. Audius therefore leaves these behaviour checking mechanisms to the community to call a vote on. This pattern is obviously not as reliable as having a computer check things, but we will definitely see more of such patterns in the future as we apply token economics to new use cases which computers cannot completely evaluate.

A new way to organize humans around endeavours

As you can see, having your own economy enables you to do what a company does – except that while a company motivates you with salaries in a national currency, in these new organizations you’re motivated by earning their own token.

And while in companies, humans approve your salary, in these new organizations the computers give out rewards – humans are only needed to verify that honest work was done.

You can’t pay for rent and food with these tokens, but it’s not like they’re useless either.

In fact, a group of people with their own economy can be thought of as its own living unit, its own organism. Isn’t that what a country is?


Design Rationale behind the Commons Stack cadCAD refactor


Since March 2020 I’ve been working on the Commons Stack’s cadCAD simulation of the Commons, refactoring and redesigning it to become a part functional, part object-oriented cadCAD simulation. This post documents the design rationale and thoughts I had in mind for the refactor, and establishes the desired design direction for the future.

What does this simulation simulate

The Commons (longer explanation here) is a group of Participants who can create and vote on Proposals, which receive funding and either fail or succeed. The Participants have vesting/nonvesting tokens depending on when they entered the simulation. The bonding curve controls the price, supply and the funding pool/collateral pool of the Commons. Participants have an individual sentiment, which can affect if they decide to stay in the Commons or exit completely, and if new Participants decide to join.

The original code was written by Blockscience’s resident math genius, Michael Zargham, where the relevant code was in conviction_cadCAD3.ipynb. I added explanations in my own fork, and started refactoring in coodcad. Later on this was merged with the Commons Stack game server backend repo in commons-simulator.

Participants and Proposals are represented as nodes in a structure depicted below, and the lines (edges) describe their relationships to each other. The general name for this type of data structure is a Directed Graph, where a Graph is basically a collection of nodes, and the “directed” just means that the edges describe a relationship “from this node to that node”. Alternately I just call it the “network”.

In this graph (not a Graph) only the Participants are depicted, with the lines representing their influence on each other.

Just some problems I had with the original code

The code badly needed “models”, like what we have in web server backend programming, and a layer that abstracted away the details of modifying the directed graph.

Or, in other words, we needed a concept of a “thing” instead of a collection of attributes, and the creation of a “thing” should be separated from “how to add that thing to the network” (which wasn’t happening).

Because everything used dictionary attributes, it was impossible for the linter to know anything about the types and thus point out programming errors.

def gen_new_participant(network, new_participant_holdings):
    i = len([node for node in network.nodes])
    network.add_node(i)
    network.nodes[i]['type'] = 'participant'
    network.nodes[i]['holdings'] = new_participant_holdings

    s_rv = np.random.rand()
    network.nodes[i]['sentiment'] = s_rv

    for j in get_nodes_by_type(network, 'proposal'):
        network.add_edge(i, j)

        rv = np.random.rand()
        a_rv = 1-4*(1-rv)*rv #polarized distribution
        network.edges[(i, j)]['affinity'] = a_rv
        network.edges[(i,j)]['tokens'] = a_rv*network.nodes[i]['holdings']
        network.edges[(i, j)]['conviction'] = 0
        network.edges[(i,j)]['type'] = 'support'

    return network

As you can see, the attributes that make up a Participant are defined ad-hoc. This part of the code may know that a Participant is supposed to have ‘sentiment’ and ‘holdings’ but what about every other function that deals with Participant? If you decide to add a new attribute to a Participant, you’d have to update the code everywhere else and you don’t know if you did everything correctly (there are no tests), and the linter can’t help you. This is a good case for having a class or at least a struct, to be used as a model/”thing”.
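A minimal sketch of what such a model could look like. The attribute names follow the snippet above; the default sentiment value is made up:

```python
# Sketch of the "model" idea: a Participant class declares its attributes in
# one place, so the linter (and every reader) knows what a Participant has.
# Attribute names follow the snippet above; the default sentiment is made up.
from dataclasses import dataclass

@dataclass
class Participant:
    holdings: float
    sentiment: float = 0.5

p = Participant(holdings=100.0)
print(p.sentiment)  # 0.5
```

Now adding a new attribute means changing one class, and the linter flags every call site that falls out of date.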

The edges have the same problem, but I didn’t find it worthwhile to change that into a class so I just left them as dict attributes.

In the for loop, you can see that it is simply setting up the relationships between the newly added Participant and the Proposal. This is merely tangentially related to the original task of adding a Participant to the graph, and if one were to make changes to the Participant-Proposal relationship, one would not remember to update this function. This task of ensuring the new Participant has a relationship to every Proposal is best handled by a separate function.
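A hypothetical sketch of that separation (function and attribute names are mine, for illustration): one function defines what a Participant is, while a separate data-layer function owns the wiring into the graph, including the support edges to every existing Proposal.

```python
import random

import networkx as nx


def create_participant(holdings: float) -> dict:
    # What a Participant *is* -- no knowledge of the graph here.
    return {"type": "participant",
            "holdings": holdings,
            "sentiment": random.random()}


def add_participant(network: nx.DiGraph, participant: dict) -> int:
    # How a Participant gets *into* the network, including the support
    # edges to every existing Proposal, all in one place.
    i = len(network.nodes)
    network.add_node(i, **participant)
    proposals = [n for n, d in network.nodes(data=True)
                 if d.get("type") == "proposal"]
    for j in proposals:
        rv = random.random()
        network.add_edge(i, j,
                         affinity=1 - 4 * (1 - rv) * rv,  # polarized
                         tokens=0.0, conviction=0, type="support")
    return i
```

If the Participant-Proposal relationship changes, only `add_participant` needs to change.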

The mechanisms that affect sentiment were spread between several policy and state update functions – it was impossible to keep track of when this variable was touched.

sentiment (the same word was used for the overall sentiment as well as individual Participants' sentiment) is mentioned in gen_new_participant, driving_process, complete_proposal, update_sentiment_on_completion, update_proposals, update_sentiment_on_release, participants_decisions and initialze_network. Notice how the function names imply that sentiment simulation is deeply woven into the simulation, not something you can just turn on or off. Furthermore, you can't tell from a function's name whether it touches sentiment at all. In the long term this is unmaintainable.

Upon trying to add an exit tribute (or anything) to the code, it was not easy to see where it should be added, and if adding it would break something else. This was a natural consequence of all these little problems building up.

To test whether the code worked, I had to run the entire simulation: there were no unit tests, and no function could run independently of cadCAD.

This also made it impossible to know whether everything was working as intended, or whether some happy coincidence (Participants' actions are random) merely kept the simulation from crashing this time – or worse, let it keep running while doing something completely unintended.

Even if the simulation worked, I wasn’t confident it was doing what was intended. Yet another natural consequence. In fact I found some things (which I’ve long since forgotten) that weren’t actually used, or working as expected.

Why is the new code designed the way it is?

As you could see already, Participant and Proposal needed to become “objects/things” as opposed to a random collection of attributes (that may not be manipulated consistently by the code). This helps pylint check that you’re manipulating the attributes correctly. Let the computer check for programming errors as much as possible with the linter so that I don’t have to run the simulation to see if things work or not.

When you start thinking about Participants and Proposals as things, you can start to imagine how, instead of a function that says "randomly do this action 30% of the time", you could simulate Participants as individuals: "randomly do this action x% of the time based on this Participant's personality, which can be profit-seeking/altruistic/balanced".
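A tiny sketch of that idea (the personality names come from the text above; the rates are illustrative numbers I made up):

```python
import random

# Hypothetical per-personality action rates -- illustrative numbers only
PROPOSAL_RATE = {"profit-seeking": 0.1, "altruistic": 0.5, "balanced": 0.3}


def wants_to_create_proposal(personality: str) -> bool:
    # Instead of one global "30% of the time", each Participant
    # rolls against its own personality-dependent rate.
    return random.random() < PROPOSAL_RATE[personality]
```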

The ultimate goal is to put a neural network in the Participant class, tell it to optimize for profit/self satisfaction, and see what behaviour emerges.

TokenBatch badly needed to be a class in order to implement token vesting. A TokenBatch keeps track of vested and unvested tokens, and how many of the vested tokens were spent once unlocked.

Implementing TokenBatch as a class enabled lots of convenience functions that would simplify code using it.
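A simplified sketch of the idea, assuming linear vesting over a fixed period (the real class also distinguishes vesting from nonvesting tokens and may use different parameters; method names here are illustrative):

```python
class TokenBatch:
    # Sketch: a batch of tokens that unlocks linearly over
    # `vesting_period` timesteps, tracking what was already spent.

    def __init__(self, total: float, vesting_period: int = 100):
        self.total = total              # tokens in this batch
        self.vesting_period = vesting_period
        self.age = 0                    # timesteps since the batch was created
        self.spent = 0.0                # unlocked tokens already spent

    def update_age(self) -> None:
        self.age += 1

    def unlocked(self) -> float:
        # Fraction of the batch vested so far, capped at 100%
        return self.total * min(1.0, self.age / self.vesting_period)

    def spendable(self) -> float:
        return self.unlocked() - self.spent

    def spend(self, amount: float) -> None:
        if amount > self.spendable():
            raise ValueError("not enough unlocked tokens in this batch")
        self.spent += amount
```

With this in place, calling code can just ask `batch.spendable()` instead of re-deriving vesting math everywhere.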

AugmentedBondingCurve, Commons followed the same logic: once I make changes to something, I really don’t want to think about the details of what’s going on under the hood. I just want it to work. Plus having them as “things” fits how people naturally think about the concepts.

It was quite important that the structure of the code be self-explanatory and not require prior understanding. Having the Commons as a class, a thing, even though it meant running some extra state update functions to copy token_supply, funding_pool, collateral_pool out of it so that cadCAD could access them easily, facilitated this.

At the same time, the functional principles (it should be easy to see that a function cannot cause unintended side effects; functions are things that modify data, which remains unchanged otherwise) that cadCAD was based on are still valuable.

Classes should be used to abstract complexity away and fit the code to how a normal human might think of the concepts. For everything else, stick to the functional paradigm.

Walking Through The Codebase

This is where the cadCAD simulation is set up; simrunner.ipynb and its CLI counterpart are simply frontends to run it from the command line or a Jupyter notebook. Here live all the policies. They are just simple functions, but there are so many of them that they're grouped under classes with @staticmethod. When writing or changing them, do create a corresponding unit test.

The idea is that if you decide one day that there should be a different way of deciding whether a new Participant joins the Commons, you create a p_desc_of_your_strategy_here() policy under GenerateNewParticipant. Please don't just edit p_randomly(). Proposal and Participant live here.

Normally in cadCAD system-dynamics style simulations the policy functions decide what happens in the simulation, but the code here is more of an ‘agent based modeling’ simulation so the policies also ask the Proposals and Participants what they will do.

System dynamics style: There is a 30% chance of a new Proposal being created every round. Assign it to a random Participant.

Agent based style: Each Participant is asked if they want to create a new Proposal – they have a 30% chance of saying yes.
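The contrast can be sketched as two policy functions with the usual cadCAD policy signature (the `Participant` stub and state layout here are hypothetical, just to make the difference concrete):

```python
import random


class Participant:
    # Minimal stand-in for the real class
    def wants_to_create_proposal(self) -> bool:
        return random.random() < 0.3


def p_system_dynamics(params, step, sL, s):
    # System-dynamics style: one global die roll; the Participant
    # is just a label picked afterwards.
    if random.random() < 0.3:
        return {"proposed_by": random.choice(list(s["participants"]))}
    return {"proposed_by": None}


def p_agent_based(params, step, sL, s):
    # Agent-based style: each Participant object decides for itself.
    proposers = [i for i, p in s["participants"].items()
                 if p.wants_to_create_proposal()]
    return {"proposed_by": proposers}
```

In the agent-based version, the decision logic lives on the Participant, so giving different Participants different behaviour is a local change.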

Since Proposal and Participant have tunable behaviour and thus need configuration parameters, but cannot (or perhaps should not – the concept of a Proposal/Participant is unrelated to cadCAD in general) know about the existence of cadCAD and its params variable, their tunable constants are defined in a separate place.

I mentioned before that one reason to have classes is to allow the linter to introspect operations and warn us if we’re doing something stupid. In practice this didn’t happen so much because we still have to use network.nodes[0]["item"] which could be anything. It could be useful to replace all instances of network.nodes[0]["item"] with a function from the data layer called get_participant(idx: int) -> Participant, which will make it clear to the linter that we are going to operate on a Participant now. This is the data layer. It translates business logic operations to network.DiGraph operations.

Simply put, when we want to add a participant to the network, that’s all we want to think about – we don’t want to have to remember “this is how to use network.DiGraph to accomplish this; oh and remember to setup the influence and support edges for this new Participant”.

As time goes by this has become a huge collection of assorted functions which has become a pain to import everywhere – I’ve been considering rolling them into a class with methods, but I need to see how much it disturbs the cadCAD functional paradigm. For now it’s not such a big problem. As mentioned before, it could use a simple get_participant(idx: int) -> Participant type of function which not only informs the linter, but does a type check so it can guarantee we’re not actually operating on a Proposal.
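A sketch of such an accessor (I've added an explicit `network` parameter and stub classes so it stands alone; the real helper may look different):

```python
import networkx as nx


class Participant:
    pass  # stand-in for the real class


class Proposal:
    pass  # stand-in for the real class


def get_participant(network: nx.DiGraph, idx: int) -> Participant:
    # Data-layer accessor: the return annotation informs the linter,
    # and the isinstance check guarantees at runtime that we aren't
    # accidentally operating on a Proposal.
    item = network.nodes[idx]["item"]
    if not isinstance(item, Participant):
        raise TypeError(f"node {idx} holds {type(item).__name__}, "
                        "not Participant")
    return item
```

Every `network.nodes[i]["item"]` call site replaced with this gets both static and runtime checking for free.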

Quirk 1: object-oriented weirdness

initial_conditions = {
    "network": network,
    "commons": commons,
    "funding_pool": commons._funding_pool,
    "collateral_pool": commons._collateral_pool,
    "token_supply": commons._token_supply,
    "policy_output": None,
    "sentiment": 0.5,
}

This is the initial state condition of the simulation, which is available later in the state update functions as s. As you can see, three of these state variables are actually copied from commons, and if we were to proceed as normal like in most cadCAD simulations, these variables would get out of date. Which is why whenever commons changes, we need to copy them out from commons into s with these state update functions – notice how update_avg_sentiment() is similar.

def update_collateral_pool(params, step, sL, s, _input):
    commons = s["commons"]
    s["collateral_pool"] = commons._collateral_pool
    return "collateral_pool", commons._collateral_pool

def update_token_supply(params, step, sL, s, _input):
    commons = s["commons"]
    s["token_supply"] = commons._token_supply
    return "token_supply", commons._token_supply

def update_funding_pool(params, step, sL, s, _input):
    commons = s["commons"]
    s["funding_pool"] = commons._funding_pool
    return "funding_pool", commons._funding_pool

def update_avg_sentiment(params, step, sL, s, _input):
    network = s["network"]
    avg_sentiment = calc_avg_sentiment(network)
    return "sentiment", avg_sentiment

Quirk 2: two state update functions depend on the output of one policy but change the same variable

    {
        "policies": {
            "which_proposals_should_be_funded": ProposalFunding.p_compare_conviction_and_threshold,
        },
        "variables": {
            "network": ProposalFunding.su_make_proposal_active,
            "commons": ProposalFunding.su_deduct_funds_from_funding_pool,
            "policy_output": save_policy_output,
        },
    },
    {
        "policies": {},
        "variables": {
            "network": ParticipantExits.su_update_sentiment_when_proposal_becomes_active,
        },
    },

The decisions made in ProposalFunding.p_compare_conviction_and_threshold are needed by ProposalFunding.su_make_proposal_active and ParticipantExits.su_update_sentiment_when_proposal_becomes_active, which both update the same state variable, network. This makes things awkward, because even though I could combine them into one function, the truth is they are separate mechanisms and this would make the code untidy.

The solution is to save the output of the policy into the state variables dict s instead, so that the result of ProposalFunding.p_compare_conviction_and_threshold is still available outside of that state update block.

def save_policy_output(params, step, sL, s, _input):
    return "policy_output", _input

Summary and Intentions

All code documentation runs the risk of being quickly outdated – but the reason I wrote this post was to keep the original principles behind the code clear even as it changes in the future. Plus, a snapshot in time that explains context is always useful.

What is Commons Stack and the Token Engineering Commons?

One of the first organizations I heard about in the token engineering space was the Commons Stack. It wasn't really a company per se; it was more of an organization that, like most blockchain layer-1 projects, was spread across the globe with no center.

Remember The DAO, whose 2016 hack forked Ethereum? Some of the Commons Stack members were involved there too.

Anyway, the Commons Stack is a community designing a type of decentralized autonomous organization called a Commons. A Commons organization uses new token engineering concepts to make donating to public goods more than just a donation.

Wait but why

Let’s take a short break from all this new age blockchain stuff to come back to the real world. Society, or should I say, the current economy, doesn’t really reward certain things for the value they provide.

The environment, for example, is constantly being abused. Most plastic isn't really recycled, but it is marketed as recyclable so consumers will keep buying it. Think your electronic devices are being recycled? Think again: they get shipped by the container into other countries' landfills. Nobody has time to separate the components into reusable raw materials, and even if they did, how could they compete with the rate at which finished products are produced? Oceans are being polluted, forests cut down, you name it. And it's just cheaper to keep doing so – the economic incentive is to continue abusing the environment.

Open source software is another thing that is constantly undervalued and abused, unless a successful capitalist company chooses to sponsor its development (because it relies heavily on it). Amazon is well known for hosting open source software, profiting heavily from it, and not giving back to the projects. While I was working at a fintech startup, Theo de Raadt (of OpenSSH and OpenBSD) came calling. Apparently he was looking to save fees on donations.

Imagine, the author of the world’s most secure OS and software that everybody uses to control servers, in a financial state where he has to worry about transaction fees.

I could go on, but this dynamic arises because of two things:

  1. Making money generally involves controlling a scarce resource, making sure nobody else has access to it, and charging money for it – hence patents, research silos, walled gardens, closed behaviour and abuse of free resources. After all, if there's something free that you can do anything with, you'd just take it, right?
  2. There are many kinds of intangible value – reputation, power, trust, stability, the happiness of a community, the health of an environment/ecosystem, animal welfare – and you can't put a number on any of them. We can only describe one type of value – monetary – and this disfigures our humanity and makes us do terrible things.

The Idea Behind a Commons

Funding: Augmented Bonding Curve

Much like Bitcoin, having your own token can serve as funding and denote membership. If you do work for the Bitcoin network, you get rewarded in BTC; if you do work for a Commons, you get paid in its token, let’s call it CTOKEN.

Since this is too important to trust humans with, we let a program handle the problem of token supply, just like Bitcoin. It sits on a blockchain and thus is tamper-resistant. We call it a bonding curve.

What a Bonding Curve Does

  1. When you put in USD, the bonding curve creates new CTOKEN, thus increasing the total supply.
  2. When you want to take out USD, the bonding curve destroys (“burns”) that CTOKEN, thus decreasing the total supply.

So it converts one form of value (USD) into another (CTOKEN), except that it also sets the price of CTOKEN based on a curve. Besides determining the supply and converting value, it also solves the matchmaking problem: someone can always buy when no one is willing to sell, and vice versa.

USD coming into the Bonding Curve is split into 2 pools:

  • Funding pool: pays the people running the Commons
  • Collateral/reserve pool: if someone wants to sell his tokens, pay him USD from this pool

As you can see, the reserve/collateral pool backs the value of the token. If you sell your CTOKEN to the bonding curve, it will destroy ("burn") the CTOKEN and you'll get USD back. Specifically, you'll get more USD back than the next person who sells after you – it is a bonding curve, after all – and the collateral pool is simply a fraction of all the money that was ever put into the Commons.

x-axis: DAI/USD deposited; y-axis: total token supply.
Mess with this too much and it becomes a Ponzi scheme!

The combination of the bonding curve and these two pools is called an Augmented Bonding Curve.
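To make the mechanics concrete, here is a toy sketch assuming a linear price curve p(s) = slope × s, so the collateral needed to back a supply s is slope × s²/2. The slope, the tribute fraction, and the exact split mechanism are illustrative, not the Commons Stack's actual design:

```python
import math


class AugmentedBondingCurve:
    # Toy sketch: linear price curve, entry tribute split off into
    # the funding pool, remainder held as collateral backing the token.

    def __init__(self, slope: float = 1.0, entry_tribute: float = 0.2):
        self.slope = slope
        self.entry_tribute = entry_tribute
        self.supply = 0.0
        self.collateral_pool = 0.0   # backs the token
        self.funding_pool = 0.0      # pays the people running the Commons

    def deposit(self, usd: float) -> float:
        # Split the incoming USD between the two pools
        self.funding_pool += usd * self.entry_tribute
        self.collateral_pool += usd * (1 - self.entry_tribute)
        # Invert the reserve function to find the new supply; mint the difference
        new_supply = math.sqrt(2 * self.collateral_pool / self.slope)
        minted = new_supply - self.supply
        self.supply = new_supply
        return minted

    def burn(self, tokens: float) -> float:
        # Pay out along the curve and destroy the tokens
        new_supply = self.supply - tokens
        payout = self.slope * (self.supply**2 - new_supply**2) / 2
        self.supply = new_supply
        self.collateral_pool -= payout
        return payout
```

A second depositor of the same USD amount gets fewer tokens than the first, because the price has moved up the curve – exactly the dynamic described above.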

Decision making

Those who have tokens can vote, and those who have more tokens have more voting power. But what if your vote's power on a particular proposal also grew the longer you kept it there, discouraging people from switching sides at the very last minute? This is called Conviction Voting.

In the Commons, people vote (using CTOKEN) on which projects should receive how much funding (also denoted in CTOKEN). And yes, you can vote on yourself, to say that you should receive funding!
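A common formulation of conviction voting models conviction as an exponentially weighted accumulation of staked tokens – a sketch (the decay parameter alpha = 0.9 is illustrative, not the TEC's actual value):

```python
def update_conviction(prev: float, staked: float, alpha: float = 0.9) -> float:
    # One timestep: old conviction decays by alpha, and the tokens
    # currently staked on the proposal add to it.  If the stake is held,
    # conviction converges to staked / (1 - alpha); if it is withdrawn,
    # conviction decays back toward zero.  A proposal passes once its
    # conviction crosses a funding-dependent threshold.
    return alpha * prev + staked
```

Because conviction takes time to build, a last-minute vote switch carries far less weight than one held from the start.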

Test Driving the Concept: The TE Commons

(1 DAI is equal to 1 USD – this peg is maintained by MakerDAO.)

Of course, there is a need to test if such an organization would work, and how people might game the system! That’s the Token Engineering Commons, a Commons that funds projects in the Token Engineering space.

Here’s how it should play out

  1. Donors put money into the TE Commons, and get TEC from the bonding curve.
  2. Donors vote upon projects using TEC, and projects receive that TEC as funding. To actually pay themselves, they sell the TEC for USD, but not all of it! Why? So that they can vote for themselves in the future, to steer funds their way!
  3. End result: the price of the token gets pushed up (remember the bonding curve), and for project owners, it is a fine balance between retaining enough voting power and getting enough funds.

But this is just the worst-case scenario.

What if an economy formed around the TEC token, giving it extra value beyond voting rights?

Let's go back in time. In the beginning there was Bitcoin. Developers poured their time into it, but nobody was paying them. Instead, if you ran a Bitcoin node and mined, you had a chance of being paid in BTC. That was all you got! Within that niche group of people, Bitcoin had some value – worth a pizza, maybe. Outside of that circle, Bitcoin was worthless. Today, projects still pay with their own coin. If you work for Decred, for example, you will only get paid in DCR, funded from the 10% of the block reward that goes to its treasury.

As outsiders slowly started to perceive Bitcoin as having value, projects like Ethereum launched ICOs. Now the Ethereum Foundation has two kinds of reserves, BTC and ETH, and it can sell both for fiat to pay its developers. ETH used to be nearly worthless – just 5 USD. Today the Ethereum Foundation can send ETH around to fund projects because it's worth 400 USD.

Today, people in economically unstable countries already perceive Bitcoin as digital gold – they transact in it every day. Wall Street in particular is starting to see Bitcoin as a great store of value that retains its purchasing power even as more USD is printed.

One day, perhaps Ethereum will be perceived as digital oil – something you need to buy to power your interactions with web3 apps. Sure, Web 2.0 was free – but not really, you were paying for it with your privacy, and those companies won’t have an incentive to care about you when they get big enough. Just look at Google, Apple, Amazon, Facebook.

Much of this, of course, has to do with perception, selling yourself, being able to convince people to join your project. Decred or Litecoin or any other coin with a fixed supply and healthy consensus mechanism could also be a store of value, and you can hold DeFi, ICOs or STOs or whatever they’re called today on Tezos, EOS, aeternity, NEO, NEM etc as well. But people flock to Bitcoin because it was the first, has the largest community, investment and hashpower behind it. Same for Ethereum.

What could people perceive a Commons Token as? I have no idea – but as you can see, the future is going to be a strange one.