Preparing Data for a Mere Simulation is Harder Than You’d Think

Balancer Simulations is an open-source cadCAD model of a Balancer V1 AMM smart contract. It is useful for exploring “what if” questions involving an AMM pool, e.g. “as an arbitrageur, how could I have made the most profit with this particular AMM pool (given historical data)?” or “how high should I set the pool fees to provide a decent ROI to liquidity providers?”. As a fundamental DeFi building block, AMMs will increasingly be used to encourage certain behaviour, so it is important to find the right parameters.


The simulation is coded to be identical to the original smart contract deployed on Ethereum, so that one can “play” with it and derive some insights. But we still needed a way to feed in user actions to the simulation, reconstructed from what actually happened on Ethereum.

The simulation doesn’t know about transactions or gas fees. Instead, it feeds on user Actions – a user creates the pool; another user joins (deposits tokens into) the pool, providing liquidity; another user swaps token A for token B, paying a fee to the pool for doing so; another user exits, taking out all the tokens he deposited in the pool + a proportion of the pool’s collected fees. These “events”, or what we called “Actions” in the data pulling script, had to be reconstructed from Ethereum log events.

Getting Ethereum log Events

It’s not so easy to get historical data from Ethereum. You need access to an archive node, and an archive node takes at least 4TB of flash storage and a week or two to sync. That’s assuming everything goes well. Plus, when you finally get data out of it, it’s not organized.

Thankfully some people have been putting historical Balancer pool events on a traditional SQL database, which can be queried easily:

select * from blockchain-etl.ethereum_balancer.BFactory_event_LOG_NEW_POOL where pool="0x..."
Give me all LOG_NEW_POOL events for the pool 0x… (since a pool is only created once, this query can only return one row).

select * from blockchain-etl.ethereum_balancer.BPool_event_LOG_JOIN where contract_address="0x..." order by block_number
Give me all the events where a user added token liquidity to the pool, and sort them by the block in which it happened.

select * from blockchain-etl.ethereum_balancer.BPool_event_LOG_SWAP where contract_address="0x..." order by block_number
Give me all the events where a user swapped a token out for another using this pool 0x…, sorted by the block in which it happened.

We do this for LOG_NEW_POOL, LOG_SWAP, LOG_JOIN, LOG_EXIT, event_Transfer, fee changes and weight changes (which were too complex to fit into one line of SQL). And afterwards we smush them all together into one sorted list of Events.
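The “smush them together” step can be sketched roughly like this. It assumes each query result has already been turned into a list of dicts carrying block_number and log_index fields; those field names, and the merge_events helper itself, are assumptions made for illustration and are not taken from the actual script.

```python
from operator import itemgetter


def merge_events(*event_lists):
    """Combine per-type event lists into one list ordered by chain position."""
    merged = [event for events in event_lists for event in events]
    # Sort by block first, then by the position of the log inside the block.
    merged.sort(key=itemgetter("block_number", "log_index"))
    return merged


# Made-up sample rows, one list per event type.
join_events = [{"type": "join", "block_number": 12, "log_index": 4}]
swap_events = [
    {"type": "swap", "block_number": 12, "log_index": 1},
    {"type": "swap", "block_number": 11, "log_index": 7},
]
all_events = merge_events(join_events, swap_events)
```

Sorting on the pair (block, log position) matters because several events can share a block, and their order inside the block still affects the pool state.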

“Why didn’t you just get all the Events sorted by block number, you dummy?”
It just wasn’t possible the way the SQL data was organized. Besides, fee changes and weight changes had to be derived, they weren’t actual log Events.

Great, so we don’t need an Ethereum archive node after all, right? Wrong – there is a special anonymous Event emitted in addition to LOG_JOIN, LOG_SWAP, LOG_EXIT that has important information that can affect the simulation accuracy.

Because Ethereum transactions may or may not go through, and tokens in the pool might change in the meantime, you might not get back exactly the amount of tokens you were expecting.

For example, if you swap 1 TOKEN-A for 500 TOKEN-B, you might get 499 or 501 TOKEN-B. Fortunately there are variants of JOIN/SWAP/EXIT methods which let the user decide if he wants to spend exactly this much TOKEN-A, or if getting back exactly 500 TOKEN-B is more important to him.
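To make this concrete, here is a rough sketch of the Balancer V1 “out given in” swap formula in plain floating point. The real contract uses fixed-point integer math, so treat this as an illustration rather than a replica; the function name and all the numbers are made up for this post.

```python
def calc_out_given_in(balance_in, weight_in, balance_out, weight_out,
                      amount_in, swap_fee):
    """Float approximation of the Balancer V1 out-given-in swap formula."""
    adjusted_in = amount_in * (1 - swap_fee)  # the fee is taken from the input
    ratio = balance_in / (balance_in + adjusted_in)
    return balance_out * (1 - ratio ** (weight_in / weight_out))


# What the user expects when signing the transaction ...
expected = calc_out_given_in(100, 1, 50_000, 1, amount_in=1, swap_fee=0.0)

# ... versus what they get if somebody else made the same swap just before them,
# so the pool now holds more TOKEN-A and less TOKEN-B.
actual = calc_out_given_in(101, 1, 50_000 - expected, 1, amount_in=1, swap_fee=0.0)
```

The second swap returns fewer tokens than the first even though the input amount is the same, which is why the user has to choose which side of the trade is fixed.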

Unfortunately, this important information was not included in the SQL database, so we needed an Ethereum archive node after all, and fellow developer Raul spent at least 2 nights deciphering this important information from the anonymous Event.

Put all the Events together into a list and group by txhash

events = []
events.extend(turn_events_into_actions(new_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(join_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(swap_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(exit_events, fees_dict, denorms_results))
events.extend(turn_events_into_actions(transfer_events, fees_dict, denorms_results))

events_grouped_by_txhash = {}
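for action in events:
    tx_hash = action.tx_hash
    if events_grouped_by_txhash.get(tx_hash) is None:
        events_grouped_by_txhash[tx_hash] = []
    # Collect every event emitted by the same transaction under its hash.
    events_grouped_by_txhash[tx_hash].append(action)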
# save_pickle(events_grouped_by_txhash, f'{args.pool_address}/events_grouped_by_txhash.pickle')
# events_grouped_by_txhash = load_pickle(f'{args.pool_address}/events_grouped_by_txhash.pickle')


Multiple log Events could actually have been emitted by a single Ethereum transaction. So now, given NEW, JOIN, SWAP, EXIT, and Transfer (pool shares, not the tokens) Events, we want to reconstruct the transactions that were relevant to this particular pool.

The above code simply smushes the events together into a long, unsorted list, and groups them by txhash.

Dirty detail: It says turn_events_into_actions() and it even uses the Action class in data/, but actually they are not yet real Actions, they are still individual Events. That’s because when I wrote the code, I intended to make them Actions, but many other problems came up and I quickly forgot my original intention.

Exception: irregularities in the data caused by 1inch aggregated swaps

We were getting token swaps that didn’t make sense. Which BAL got turned into WBTC, and which into WETH? It is not clear.

"action": {
    "type": "swap",
    "tokens_in": [
        {
            "amount": "447.23532971026",
            "symbol": "BAL"
        },
        {
            "amount": "157.26956649152",
            "symbol": "BAL"
        }
    ],
    "tokens_out": [
        {
            "amount": "7279711",
            "symbol": "WBTC"
        },
        {
            "amount": "6.450635831831913964",
            "symbol": "WETH"
        }
    ]
}

As it turns out, this is the work of the smart hacks working at 1inch, who aggregate swaps into a single transaction to save on gas fees. So we had to modify our data parsing script to recognize transactions like these and emit two Actions instead of one.
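The idea of emitting one Action per swap can be sketched as follows. This is not the code from the script: the dict keys, the swaps_to_actions name and the sample amounts are all invented for illustration, and real events would come from the LOG_SWAP rows of a single transaction.

```python
def swaps_to_actions(swap_events):
    """Emit one swap Action (here a plain dict) per swap event in a transaction."""
    actions = []
    for event in swap_events:
        actions.append({
            "type": "swap",
            "token_in": {"symbol": event["token_in"], "amount": event["amount_in"]},
            "token_out": {"symbol": event["token_out"], "amount": event["amount_out"]},
        })
    return actions


# Two swaps that an aggregator bundled into one transaction (made-up values).
events_in_one_tx = [
    {"token_in": "TOKEN-A", "amount_in": "10", "token_out": "TOKEN-B", "amount_out": "4"},
    {"token_in": "TOKEN-A", "amount_in": "5", "token_out": "TOKEN-C", "amount_out": "2"},
]
actions = swaps_to_actions(events_in_one_tx)
```

Keeping each input paired with its own output is the point: once the amounts are merged into two flat lists, the pairing is lost.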

Turn Events into simulation-relevant Actions

# Remove pool share transfers
grouped_events = list(filter(lambda acts: not (len(acts) == 1 and acts[0].action_type == 'transfer'), grouped_events))

actions = stage3_merge_actions(args.pool_address, grouped_events)

# save_pickle(actions, f"{args.pool_address}/actions.pickle")
# actions = load_pickle(f"{args.pool_address}/actions.pickle")

We remove pool share transfers because they are irrelevant to the simulation (the number of pool shares is a consequence of the inputs, not something we should feed into the simulation).

stage3_merge_actions is where we take Events and merge them into Actions.

  • Yes, it should be called stage3_merge_events_into_actions. Naming is hard.
  • stage3_merge_actions() doesn’t even use the Action class, which I originally intended to be used here. Oh well.
  • stage3_merge_actions() is also where we ask the Ethereum archive node for the anonymous Event, decipher it and add its data into the “Action”. This should actually belong in section 1, where we get the different Event types from the SQL database, but the code is the way it is.

Interleave hourly prices between the Actions

Since I wrote the part that gets prices from Coingecko API, I’ll explain that here.
As long as you only request 3 months of data, Coingecko gives historical hourly prices. For free. However, this only goes a ways back – you won’t get hourly pricing data for 2018 even if you request 1 day at a time. This inconsistency is conveniently not mentioned on the Coingecko API page.

The hourly pricing data that Coingecko returns is not regular either – you might get

2021-01-01 00:47:33 1000
2021-01-01 01:33:21 1001
2021-01-01 02:17:05 999

which is worrisome. Now I have to round the timestamps to the nearest hour, and 01:33:21 rounds to 02:00:00, but 02:17:05 also rounds to 02:00:00! So I’ll have to throw something away.
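One way to handle this is to round every timestamp to the nearest hour and keep only the sample that was closest to that hour. That rule is just one reasonable choice and not necessarily what the script does; the function names below are made up for this sketch.

```python
from datetime import datetime, timedelta


def round_to_hour(ts):
    """Round a datetime to the nearest full hour."""
    return (ts + timedelta(minutes=30)).replace(minute=0, second=0, microsecond=0)


def hourly_prices(samples):
    """Keep, for each rounded hour, the sample that was closest to that hour."""
    best = {}  # rounded hour -> (distance to the hour, price)
    for ts, price in samples:
        hour = round_to_hour(ts)
        distance = abs(ts - hour)
        if hour not in best or distance < best[hour][0]:
            best[hour] = (distance, price)
    return {hour: price for hour, (_, price) in sorted(best.items())}


samples = [
    (datetime(2021, 1, 1, 0, 47, 33), 1000),
    (datetime(2021, 1, 1, 1, 33, 21), 1001),
    (datetime(2021, 1, 1, 2, 17, 5), 999),
]
prices = hourly_prices(samples)
```

With the three samples above, the 01:33:21 price is the one that gets thrown away, because 02:17:05 is closer to 02:00:00.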

Then again, who’s to say other pricing data services like Tradingview aren’t doing the same thing in the background?

Lesson Learned

Franz Kafka was known to find faults in anything that came from his own pen, except for a chosen few, amongst them The Judgment, which he supposedly wrote in a single 8 hour cohesive sitting, and was the first short story he was truly proud of.

That was also how I wrote the data ingestion script for the simulation. It was a beautiful, cohesive solution that fit very well to the problem.

But the problem changed. The 1inch aggregated swap problem came up. Prices had to be added. The archive node had to be queried for additional information, and nobody had time to rewrite everything. Over time, it became a jumbled mess, far from the elegant solution I had envisioned.

Kafka knew what he wanted to express, and it stayed the same. As programmers, we think we know the problem, but we don’t. It changes, or time shows us that our understanding was wrong or incomplete. Hence:

As a programmer, prefer the flexible solution, not the most beautiful/elegant solution.

me, who else

cadCAD can’t simulate humans

Now and then, when people hear I code economic simulations with cadCAD (and I just heard about its Rust cousin radCAD), they want me to write one for them. And it usually involves asking how humans would behave in a specific situation.

here’s the thing

I don’t know.

I can’t simulate human psychology.

What would people do in this or that case? This is not what a simulation does. A simulation says “given these conditions, and behaviours, this will happen (most/some of the time)”. One must be very clear about what these behaviours are and formulate the question in such a way that we can answer it without having to program mini-humans. Mini-humans are impossible to verify anyway.

Bad question: “What would humans do if I raised taxes?”
Better question: “How many people would stay in my system given a group of humans (with income distribution A, differing tolerances for tax raises B) and if I raised taxes above a certain threshold?”

You get the idea.

Math helps to set your intentions in stone. If you can express every interaction and change as an equation, everyone can verify that the simulation is working as intended. After all, if I write a complex system that I claim simulates how humans would behave, how is anybody going to verify that? I was probably just guessing; and it’s hard to verify if the code does what I intended. It’s far easier to test if an equation computed the right result.
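The “better question” above can be written down in a few lines once the behaviour is stated explicitly. In this sketch the rule is “a person stays if the tax rate does not exceed their personal tolerance”; the uniform distribution of tolerances and all the numbers are assumptions picked purely for illustration.

```python
import random


def count_stayers(tolerances, tax_rate):
    """A person stays if the tax rate does not exceed their personal tolerance."""
    return sum(1 for tolerance in tolerances if tax_rate <= tolerance)


rng = random.Random(42)  # fixed seed so the run is reproducible
# Assumption: tolerances are uniformly distributed between 10% and 50%.
tolerances = [rng.uniform(0.10, 0.50) for _ in range(1000)]

stay_at_20 = count_stayers(tolerances, tax_rate=0.20)
stay_at_40 = count_stayers(tolerances, tax_rate=0.40)
```

Nobody has to trust a model of human psychology here. Anyone can check that count_stayers does what the one-line rule says, and argue separately about whether the rule and the distribution are realistic.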

Design Rationale behind the Commons Stack cadCAD refactor


Since March 2020 I’ve been working on the Commons Stack’s cadCAD simulation of the Commons, refactoring and redesigning it to become a part functional, part object oriented cadCAD simulation. This post documents the design rationale and thoughts I had in mind for the refactor, and establishes the desired design direction for the future.

What does this simulation simulate

The Commons (longer explanation here) is a group of Participants who can create and vote on Proposals, which receive funding and either fail or succeed. The Participants have vesting/nonvesting tokens depending on when they entered the simulation. The bonding curve controls the price, supply and the funding pool/collateral pool of the Commons. Participants have an individual sentiment, which can affect if they decide to stay in the Commons or exit completely, and if new Participants decide to join.

The original code was written by Blockscience’s resident math genius, Michael Zargham, where the relevant code was in, and conviction_cadCAD3.ipynb. I added explanations in my own fork, and started refactoring in coodcad. Later on this was merged with the Commons Stack game server backend repo in commons-simulator.

Participants and Proposals are represented as nodes in a structure depicted below, and the lines (edges) describe their relationships to each other. The general name for this type of data structure is a Directed Graph, where a Graph is basically a collection of nodes, and the “directed” just means that the edges describe a relationship “from this node to that node”. Alternately I just call it the “network”.

In this graph (not a Graph) only the Participants are depicted, with the lines representing their influence on each other.

Just some problems I had with the original code

The code badly needed “models”, like what we have in web server backend programming, and a layer that abstracted away the details of modifying the directed graph.

Or, in other words, we needed a concept of a “thing” instead of a collection of attributes, and the creation of a “thing” should be separated from “how to add that thing to the network” (which wasn’t happening).

Because everything used dictionary attributes, it was impossible for the linter to know anything about the types and thus point out programming errors.

def gen_new_participant(network, new_participant_holdings):
    # The new node gets the next free index.
    i = len([node for node in network.nodes])

    network.add_node(i)
    network.nodes[i]['type'] = 'participant'

    s_rv = np.random.rand()
    network.nodes[i]['sentiment'] = s_rv
    network.nodes[i]['holdings'] = new_participant_holdings

    for j in get_nodes_by_type(network, 'proposal'):
        network.add_edge(i, j)

        rv = np.random.rand()
        a_rv = 1-4*(1-rv)*rv #polarized distribution
        network.edges[(i, j)]['affinity'] = a_rv
        network.edges[(i,j)]['tokens'] = a_rv*network.nodes[i]['holdings']
        network.edges[(i, j)]['conviction'] = 0
        network.edges[(i,j)]['type'] = 'support'

    return network

As you can see, the attributes that make up a Participant are defined ad-hoc. This part of the code may know that a Participant is supposed to have ‘sentiment’ and ‘holdings’ but what about every other function that deals with Participant? If you decide to add a new attribute to a Participant, you’d have to update the code everywhere else and you don’t know if you did everything correctly (there are no tests), and the linter can’t help you. This is a good case for having a class or at least a struct, to be used as a model/”thing”.
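For comparison, here is a minimal sketch of what a Participant “thing” could look like. It is not the class from the repo; the spend() method and the defaults are invented to show the idea of declaring the attributes in exactly one place.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Everything that makes up a Participant, declared in one place."""
    holdings: float = 0.0
    # Each new Participant starts with a random sentiment in [0, 1).
    sentiment: float = field(default_factory=random.random)

    def spend(self, amount: float) -> None:
        """Reduce holdings, refusing to go below zero."""
        if amount > self.holdings:
            raise ValueError("cannot spend more than current holdings")
        self.holdings -= amount


alice = Participant(holdings=100.0)
alice.spend(30.0)
# print(alice.holdngs)  # a misspelled attribute is now something a linter can flag
```

Adding a new attribute now means touching one class definition, and every function that receives a Participant can rely on it being there.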

The edges have the same problem, but I didn’t find it worthwhile to change that into a class so I just left them as dict attributes.

In the for loop, you can see that it is simply setting up the relationships between the newly added Participant and the Proposal. This is merely tangentially related to the original task of adding a Participant to the graph, and if one were to make changes to the Participant-Proposal relationship, one would not remember to update this function. This task of ensuring the new Participant has a relationship to every Proposal is best handled by a separate function.

The mechanisms that affect sentiment were spread between several policy and state update functions – it was impossible to keep track of when this variable was touched.

sentiment (the same word was used for overall sentiment as well as individual Participants’ sentiment) is mentioned in gen_new_participant, driving_process, complete_proposal, update_sentiment_on_completion, update_proposals, update_sentiment_on_release, participants_decisions, initialze_network. Notice how the function names imply that sentiment simulation is highly integrated into the simulation and not something you can just “turn on/off”. Furthermore you can’t tell by the name of the function whether it uses sentiment or not. In the long term this is unmaintainable.

Upon trying to add an exit tribute (or anything) to the code, it was not easy to see where it should be added, and if adding it would break something else. This was a natural consequence of all these little problems building up.

To test if the code worked, I had to run the entire simulation. There were no unit tests, no function could run independently of cadCAD.

This also made it impossible to know if everything was working as intended, or if some happy coincidence (Participants’ actions are random) had kept the simulation from crashing this time – or worse, if it kept running but did something completely unintended.

Even if the simulation worked, I wasn’t confident it was doing what was intended. Yet another natural consequence. In fact I found some things (which I’ve long since forgotten) that weren’t actually used, or working as expected.

Why is the new code designed the way it is?

As you could see already, Participant and Proposal needed to become “objects/things” as opposed to a random collection of attributes (that may not be manipulated consistently by the code). This helps pylint check that you’re manipulating the attributes correctly. Let the computer check for programming errors as much as possible with the linter so that I don’t have to run the simulation to see if things work or not.

When you start thinking about Participants and Proposals as things, you can start to imagine how, instead of a function that says “randomly do this action 30% of the time”, you could simulate Participants as individuals: “randomly do this action x% of the time based on this Participant’s personality, which can be profit-seeking/altruistic/balanced”.

The ultimate goal is to put a neural network in the Participant class, tell it to optimize for profit/self satisfaction, and see what behaviour emerges.

TokenBatch badly needed to be a class in order to implement token vesting. A TokenBatch keeps track of vested and unvested tokens, and how many of the vested tokens were spent once unlocked.

Implementing TokenBatch as a class enabled lots of convenience functions that would simplify code using it.
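To give a feel for it, here is a simplified sketch of such a class. It is not the TokenBatch from the repo: the linear vesting schedule, the field names and the rule of spending nonvesting tokens first are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class TokenBatch:
    vesting: float               # tokens subject to a vesting schedule
    nonvesting: float            # tokens that are freely spendable
    vesting_spent: float = 0.0   # vested tokens that were already spent
    vesting_days: int = 100      # assumption: linear vesting over this many days
    age_days: int = 0

    def unlocked(self) -> float:
        """Vesting tokens that have been released so far."""
        fraction = min(self.age_days / self.vesting_days, 1.0)
        return self.vesting * fraction

    def spendable(self) -> float:
        return self.nonvesting + self.unlocked() - self.vesting_spent

    def spend(self, amount: float) -> None:
        if amount > self.spendable():
            raise ValueError("not enough unlocked tokens")
        # Use up nonvesting tokens first, then dip into unlocked vesting tokens.
        from_nonvesting = min(amount, self.nonvesting)
        self.nonvesting -= from_nonvesting
        self.vesting_spent += amount - from_nonvesting


batch = TokenBatch(vesting=1000.0, nonvesting=50.0, age_days=25)
batch.spend(100.0)
```

Code that uses the batch only ever calls spendable() and spend(); the bookkeeping of what is locked, unlocked or already spent stays inside the class.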

AugmentedBondingCurve, Commons followed the same logic: once I make changes to something, I really don’t want to think about the details of what’s going on under the hood. I just want it to work. Plus having them as “things” fits how people naturally think about the concepts.

It was quite important that the structure of the code be self-explanatory and not require prior understanding. Having the Commons as a class, a thing, even though it meant running some extra state update functions to copy token_supply, funding_pool, collateral_pool out of it so that cadCAD could access them easily, facilitated this.

At the same time, the functional principles (it should be easy to see that a function cannot cause unintended side effects; functions are things that modify data, which remains unchanged otherwise) that cadCAD was based on are still valuable.

Classes should be used to abstract complexity away and fit the code to how a normal human might think of the concepts. For everything else, stick to the functional paradigm.

Walking Through The Codebase

is where the cadCAD simulation is set up., simrunner.ipynb are simply frontends to run this from the CLI or Jupyter notebook.

Here live all the policies. They are just simple functions, but there are so many of them, so they’re grouped under classes with @staticmethod. When writing or changing them, do create a corresponding unit test in

The idea is if you decide one day that there should be a different way of deciding whether a new Participant joins the Commons, you create a p_desc_of_your_strategy_here() policy under GenerateNewParticipant. Please, don’t just edit p_randomly().

Proposal and Participant live here.

Normally in cadCAD system-dynamics style simulations the policy functions decide what happens in the simulation, but the code here is more of an ‘agent based modeling’ simulation so the policies also ask the Proposals and Participants what they will do.

System dynamics style: There is a 30% chance of a new Proposal being created every round. Assign it to a random Participant.

Agent based style: Each Participant is asked if they want to create a new Proposal – they have a 30% chance of saying yes.
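The difference between the two styles can be sketched like this. None of this is code from the repo; the function names, the wants_to_create_proposal() method and the per-Participant probability are made up for the example.

```python
import random


def new_proposals_system_dynamics(participants, rng, probability=0.3):
    """One global roll per round; a random Participant gets the Proposal."""
    if rng.random() < probability:
        return [rng.choice(participants)]
    return []


class Participant:
    def __init__(self, name, probability=0.3):
        self.name = name
        self.probability = probability  # could differ per Participant

    def wants_to_create_proposal(self, rng):
        return rng.random() < self.probability


def new_proposals_agent_based(participants, rng):
    """Every Participant is asked individually and decides for itself."""
    return [p for p in participants if p.wants_to_create_proposal(rng)]


rng = random.Random(0)
participants = [
    Participant("a"),
    Participant("b", probability=1.0),  # always says yes
    Participant("c", probability=0.0),  # never says yes
]
proposers = new_proposals_agent_based(participants, rng)
```

In the agent based version the policy no longer contains the decision itself. It only collects the answers, so changing how one kind of Participant behaves does not require touching the policy.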

Since Proposal, Participant have tunable behaviour and thus need configuration parameters, but cannot (or maybe, should not? – because the concept of a Proposal/Participant is unrelated to cadCAD in general) know about the existence of cadCAD and its params variable, their tunable constants are defined in a different place,

I mentioned before that one reason to have classes is to allow the linter to introspect operations and warn us if we’re doing something stupid. In practice this didn’t happen so much because we still have to use network.nodes[0]["item"] which could be anything. It could be useful to replace all instances of network.nodes[0]["item"] with a function from the data layer called get_participant(idx: int) -> Participant, which will make it clear to the linter that we are going to operate on a Participant now.

This is the data layer. It translates business logic operations to network.DiGraph operations.

Simply put, when we want to add a participant to the network, that’s all we want to think about – we don’t want to have to remember “this is how to use network.DiGraph to accomplish this; oh and remember to set up the influence and support edges for this new Participant”.

As time goes by this has become a huge collection of assorted functions which has become a pain to import everywhere – I’ve been considering rolling them into a class with methods, but I need to see how much it disturbs the cadCAD functional paradigm. For now it’s not such a big problem. As mentioned before, it could use a simple get_participant(idx: int) -> Participant type of function which not only informs the linter, but does a type check so it can guarantee we’re not actually operating on a Proposal.
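Such a function could look roughly like the sketch below. To keep the example self-contained it takes the node mapping as an extra argument and uses a plain dict as a stand-in for network.nodes; the stub Participant and Proposal classes are placeholders, not the real ones.

```python
from typing import Any, Mapping


class Participant:
    def __init__(self, holdings: float = 0.0):
        self.holdings = holdings


class Proposal:
    def __init__(self, funds_requested: float = 0.0):
        self.funds_requested = funds_requested


def get_participant(nodes: Mapping[int, Mapping[str, Any]], idx: int) -> Participant:
    """Return the Participant stored at node idx, or fail loudly."""
    item = nodes[idx]["item"]
    if not isinstance(item, Participant):
        raise TypeError(f"node {idx} holds a {type(item).__name__}, not a Participant")
    return item


# A plain dict stands in for network.nodes here.
nodes = {
    0: {"item": Participant(holdings=10.0)},
    1: {"item": Proposal(funds_requested=500.0)},
}
participant = get_participant(nodes, 0)
```

The return annotation tells the linter what comes out, and the isinstance check catches at runtime the case where the index actually points at a Proposal.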

Quirk 1: object oriented weirdness in

initial_conditions = {
    "network": network,
    "commons": commons,
    "funding_pool": commons._funding_pool,
    "collateral_pool": commons._collateral_pool,
    "token_supply": commons._token_supply,
    "policy_output": None,
    "sentiment": 0.5
}

This is the initial state condition of the simulation, which is available later in the state update functions as s. As you can see, three of these state variables are actually copied from commons, and if we were to proceed as normal like in most cadCAD simulations, these variables would get out of date. Which is why whenever commons changes, we need to copy them out from commons into s with these state update functions – notice how update_avg_sentiment() is similar.

def update_collateral_pool(params, step, sL, s, _input):
    commons = s["commons"]
    s["collateral_pool"] = commons._collateral_pool
    return "collateral_pool", commons._collateral_pool

def update_token_supply(params, step, sL, s, _input):
    commons = s["commons"]
    s["token_supply"] = commons._token_supply
    return "token_supply", commons._token_supply

def update_funding_pool(params, step, sL, s, _input):
    commons = s["commons"]
    s["funding_pool"] = commons._funding_pool
    return "funding_pool", commons._funding_pool

def update_avg_sentiment(params, step, sL, s, _input):
    network = s["network"]
    # Use a separate name so the state dict s is not shadowed.
    avg_sentiment = calc_avg_sentiment(network)
    return "sentiment", avg_sentiment

Quirk 2: two state update functions depend on the output of one policy but change the same variable

    {
        "policies": {
            "which_proposals_should_be_funded": ProposalFunding.p_compare_conviction_and_threshold
        },
        "variables": {
            "network": ProposalFunding.su_make_proposal_active,
            "commons": ProposalFunding.su_deduct_funds_from_funding_pool,
            "policy_output": save_policy_output,
        }
    },
    {
        "policies": {},
        "variables": {
            "network": ParticipantExits.su_update_sentiment_when_proposal_becomes_active,
        }
    },

The decisions made in ProposalFunding.p_compare_conviction_and_threshold are needed by ProposalFunding.su_make_proposal_active and ParticipantExits.su_update_sentiment_when_proposal_becomes_active, which both update the same state variable, network. This makes things awkward, because even though I could combine them into one function, the truth is they are separate mechanisms and this would make the code untidy.

The solution is to save the output of the policy into the state variables dict s instead, so that the result of ProposalFunding.p_compare_conviction_and_threshold is still available outside of that state update block.

def save_policy_output(params, step, sL, s, _input):
    return "policy_output", _input
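A later block can then read the saved decision from s instead of _input. The sketch below simulates by hand what cadCAD does across the two blocks; the second state update function, the "funded_proposals" key and the sentiment arithmetic are hypothetical simplifications and not the code from the repo.

```python
def save_policy_output(params, step, sL, s, _input):
    return "policy_output", _input


def su_boost_sentiment_of_funded(params, step, sL, s, _input):
    # Runs in a later block: _input is empty there, so read the saved decision.
    funded = s["policy_output"]["funded_proposals"]  # hypothetical key
    sentiment = min(s["sentiment"] + 0.1 * len(funded), 1.0)
    return "sentiment", sentiment


# Simulate by hand what cadCAD does across the two blocks.
state = {"policy_output": None, "sentiment": 0.5}
policy_result = {"funded_proposals": [3, 7]}

# Block 1: the policy ran, and its output is stored as a state variable.
key, value = save_policy_output({}, 1, [], state, policy_result)
state[key] = value

# Block 2: no policies run here, but the decision is still available in s.
key, value = su_boost_sentiment_of_funded({}, 2, [], state, {})
state[key] = value
```

The cost is one extra state variable that exists only to carry data between blocks, which is why it is listed as a quirk.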

Summary and Intentions

All code documentation runs the risk of being quickly outdated – but the reason I wrote this post was to keep the original principles behind the code clear even as it changes in the future. Plus, a snapshot in time that explains context is always useful.