Metamind: Meta Goals für Alles

By: dreev and bee
Spec level:
Last updated: 2022-12-23
Gissue: #2771

Changelog

2022-12-23: spec for phase 1 at the end of this document
2022-08-15: more open questions and use cases

The Metamind integration means treating Beeminder itself as an official integration where the metric is “number of datapoints added on any of a list of Beeminder graphs”. It’s like Trello or URLminder where you can specify an arbitrary list of boards/URLs that all count towards the metric.

Modeling it on Trello it would look vaguely like this:

Mockup of the Metamind goal creation page based on Trello

But even easier, we’ll model it on URLminder like so:

Mockup of the Metamind goal creation page based on URLminder

You just paste Beeminder URLs into that text area. (We could even get a form of group goals this way, if we allowed specifying other people’s goals in that list. But we won’t, at least not for now!)
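To make the text-area idea concrete, here is a minimal sketch of turning the pasted text into a list of goal names. The helper name `parse_goal_urls`, the accepted URL shapes, and the choice to silently drop other people's goals are all assumptions for illustration, not decided behavior.

```ruby
# Hypothetical pattern for goal URLs like bmndr.co/carol/weight or
# https://www.beeminder.com/carol/weight (the accepted shapes are an assumption).
GOAL_URL = %r{\A(?:https?://)?(?:www\.)?(?:beeminder\.com|bmndr\.co)/([\w-]+)/([\w-]+)/?\z}

# Return the goal names found in the pasted text that belong to `owner`.
# Anything that doesn't parse as a goal URL, or that belongs to someone
# else, is dropped (since we're not doing group goals for now).
def parse_goal_urls(text, owner)
  text.split(/[\s,]+/)
      .map { |url| GOAL_URL.match(url) }
      .compact
      .select { |m| m[1] == owner }
      .map { |m| m[2] }
      .uniq
end
```

For example, pasting carol's weight and steps URLs plus one of bob's goals would yield just carol's two goal names.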

Use Cases

Weigh-ins goals work beautifully with this MVP. Just create a weighins goal that counts datapoints from your weight goal. Voila.
Committing to using Beeminder! You create a meta goal to count datapoints from all your other goals. Then no matter how conservative those other goals are, you can commit to entering data on all or most of them every day.
An interesting Quantified Self metric: how many datapoints you’re entering.

Open Questions

[NON-MENDOZA] Do we want an option for “all goals except this one”? That could be nice for a meta goal of “keep using Beeminder”, which you may want to create before you’ve even created any other goals. And it may be tedious to keep updating the list of goals feeding your meta goal as you add new goals.
[NON-MENDOZA] What about the fancy things that IFTTT macros let you do? (Is this just a special case of or generalization of aggday?) What’s the next most common use case that needs more than just number of datapoints added on other goals?
Should we put these stats like total datapoints added publicly on users’ profile pages or anything?
Presumably we’ll want to fetch these things from the API. How do we design those API endpoints? Let’s make it easier on ourselves than fetching the whole list of datapoints and counting them.
Should it only count one datapoint per goal per day? Default: no, that’s a bit magical / an extra if-statement. And better to err on the side of generosity and count more things rather than less.
Should it only count kinetic (nonfrozen) goals? Default: same answer.
What about the use case of mirroring one goal’s data to another goal? Should that be the fundamental thing that the Metamind integration does because that’s what’s most general? If you want to just count datapoints from other goals you can mirror them and set aggday to count.
But setting a custom aggday is pretty advanced so maybe Metamind should offer a choice of metrics: number of datapoints or raw datapoints. You could choose which one with a dropdown just like integrations like Strava where you pick from a list of metrics. Since a weigh-ins goal is the most canonical, simple use case, let’s say number of datapoints is the only metric for the MVP and then we can add raw datapoints after that.
How do we ensure people don’t create infinite loops?
Adam’s argument that keeping the data in sync between ur-goals and meta-goals is easier with mirroring. Plus we should improve aggday for easier handling of turning mirrored data to +1’s.
Adam again: at deadline time, we need to make sure we run the first goal before the second goal, regardless of retries. Claim: existing autodata integrations are robust to that (by doing one last check at ≥ the deadline to add any eked data) and the meta integration would inherit that robustness.
Do we want a constraint that you can’t choose a parent goal to feed a meta goal if the parent is itself a meta goal? Default: yes, that’s the simplest way to prevent a data singularity.
Are we polling Beeminder for new data or using callbacks/triggers?
If we’re just counting datapoints, is that too dumb for some use cases, like an odometer goal where there are lots of datapoints with the same value? For the “how much data am I entering” use case it makes sense to say “we just count the actual number of datapoints; editing a datapoint isn’t a new datapoint; redundant additional datapoints ARE still new datapoints. It’s just about how many datapoints got added; that’s the metric”. For other use cases that may be dumb, like if the meta goal really means “made some amount of progress each day”. But maybe that can be the user’s problem? “We just count datapoints so make sure there’s one datapoint per day that you made progress”.
More concerns from Adam: If users get an email 20 minutes after their deadline, saying, “hey, you derailed yesterday at your deadline”, it’s OK. One example, but not the only example, is that RescueTime goals aren’t processed until 15 minutes after their deadline*, because free RescueTime takes a while to gather the data and make it available via the API. If our metagoal isn’t aware of the source goal’s fetch timing, it’s going to derail people erroneously. Both footime and footime’s meta goal are going to get poked at deadline, and footime will schedule a refresh job for 15 minutes from now, and footime_meta will… well, if it’s like “any other integration”, will schedule a refreshjob, pull data from footime, which hasn’t refreshed yet, and then 15 minutes later, footime will run, refreshing itself. That’s gonna be bogus, because the user will wake up, footime is up to date and looks fine, and footime’s meta goal is derailed, and it might not even have the data that footime has yet! If they hit refresh on footime’s meta goal, it’ll pull in the latest datapoints from footime, and the user will wonder “why did those moneygrubbers tell me I derailed when they could have just refreshed my goal? They should have known — the source goal was from Beeminder itself, for pete’s sake!!!!!!1!!!” So maybe we can insert a delay for meta goals (Nicky’s idea) or make them premium-only with warnings (Clive’s idea) or do what we do for RescueTime (Adam’s idea)? Adam also points out the distinction between delaying a deadline vs just processing a goal later than the deadline to make sure the data is synced.
If multiple goals can feed a meta goal, is that a burden for users with waterfall deadlines? Adam wrote about this in the forum years ago. Clive points out that in the MVP we can just say it’s the user’s problem to make the meta deadline after the last waterfall deadline.
The general answer to most of these open questions is (a) we finish spec’ing out the worse-is-better version that doesn’t worry about potential monkey wrenches like out of sync deadlines, (b) we soft launch the worse-is-better version and start using it ourselves and then let some dailies in so at first it’s just for power users who know that, e.g., the parent goal might not have its data in time so just be aware and have some eke buffer, (c) we get a sense of how correct Adam is about the complications, and (d) we either robust stuff up or find workarounds, etc. And as for (b), maybe we decide that this whole violation of the anti-grace-period principle needs to die and RescueTime users can suck it. If RescueTime doesn’t have your data by midnight, you’ll derail and you just need to be aware of that cuz Beeminder is always gonna do the simple predictable thing of checking the autodata source at the deadline, end of story.
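The counting questions above (count everything vs at most one datapoint per goal per day) can be sketched as follows. This is only an illustration: the name `metamind_count`, the option name, and the datapoint shape (a Hash with `:goal` and `:daystamp` keys) are assumptions, and the default matches the "count everything" default answer.

```ruby
# Count datapoints feeding a meta goal.
# Default: count every datapoint (the generous, non-magical option).
# With one_per_goal_per_day: true, count each (goal, day) pair at most once.
# Datapoints are assumed to be Hashes with :goal and :daystamp keys.
def metamind_count(datapoints, one_per_goal_per_day: false)
  return datapoints.size unless one_per_goal_per_day
  datapoints.map { |d| [d[:goal], d[:daystamp]] }.uniq.size
end
```

So two weigh-ins on the same day count as 2 by default and as 1 under the stricter option, which shows how small the extra if-statement would be if we ever want it.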

Background and Philosophizing

The reason we first thought of this is that it complements auto-canceling subscriptions beautifully. We actually have had at least one complaint about auto-canceling subscriptions because auto-repeating subscriptions are a kind of commitment device to get yourself to keep using the service. Well we sure as heck have a better solution to that problem!

We even had the idea that we could create a meta goal for everyone automatically when they first sign up so you don’t start with an empty gallery. It’s an inherently interesting Quantified Self metric: how much am I actually tracking altogether? So many people are overwhelmed upon signing up; having a starter goal could really help. And you’re signing up for Beeminder because you want to commit to things so why not dive in by committing to use Beeminder itself?

Brainstorming More Use Cases

A weigh-ins goal based on a weight goal.
A meta goal for entering data on your other goals.
Mini-habits like a pushups goal and also a “did at least one pushup each day” goal. More generally, ambitious vs bare-min goals using the same data.
A fasting goal as a derivative of a calories goal.
Microcovid budgets aggregated over household members.
Beeminding expenses, beeminding income, and then also beeminding savings which is income minus expenses.
More generally, beeminding subsets of data like discretionary or household spending.
Urgency load.
Multiple specific exercise types all feeding into a general “exercise this many times per week” goal. (HT Maxwell Joslyn)
Maxwell Joslyn: I have a “minutes / day spent on goal oriented activity” incl coding, writing, game design which works well, feels good. But if I want to make a thesis specific “minutes / day” goal, then I have to enter/update all those datapoints twice. Would love to have the thesis-specific goal be able to update the more general goal.

Circumscribed Spec for Phase 1: [pithy description involving webhook callback push something]

It would be really nice if Beeminder goals could push data to other goals as soon as they get it. We don’t want to poll ourselves for updates like we do with other autodata sources.

Oh hey! We already have a data-push thing built into Beeminder. How convenient.

Externally we call it “Webhook for real-time PESOS (aka data push)”. Internally it’s referred to by the field that stores the URL to push to (callback_url) and the name of the job that runs to push that data when a change happens to a goal’s datapoints (IwcRemoteStream).

(Historical aside: IWC here stands for “Indie Web Camp”, and we implemented this feature ages ago before we even had a public API, at an Indie Web Camp weekend hackathon thing. The idea is to keep a realtime backup of your Beeminder data or trigger some action on your own webserver anytime new data appears in a Beeminder goal. More in our help docs.)

So as a starting point for the full meta-integration, we are making it nice and simple to replace the simple “weigh-ins”-type IFTTT goals (or as Adam Wolf calls them “foo days”) that many of us have, without having to set up your own web server to receive those pushes. We will make a Beeminder endpoint, api/v1/users/USERNAME/webhook [nominologist needed], to receive those callbacks and then update a different Beeminder goal with new data. [was: “do appropriate stuff” — this is spelled out momentarily but need more of a hint here]. Like “hey! I[beeminder] just got a new datapoint on carol/weight, I’d[beeminder’d] better update carol/weighins”. [everything is still very confusing at this point, like who the “I” in “I’d better” is => “i’d better” is beeminder talking to itself]

Let’s walk through a use case where Carol has a weight goal, bmndr.co/carol/weight, and a weighins goal, bmndr.co/carol/weighins, that is currently fed by an IFTTT recipe. Carol would like to disintermediate IFTTT and make that weighins goal automatically be updated by Beeminder directly, whenever her weight goal is updated.

First, Carol goes to her goal settings on carol/weight and pastes the following URL into the “Webhook for real-time PESOS (aka data push)” text field in the Data section of the Settings tab:

https://www.beeminder.com/api/v1/users/carol/webhook.json?goal=weighins&auth_token=TOKEN

[NOTE: because we are pushing data FROM carol/weight TO carol/weighins, we need to set the callback url on weight, and the url needs to point TO weighins. we need some kind of “pusher” and “receiver” jargon here. and probably also a wizard to help set this up if we really want to call it “nice and simple”]

Now, when a new datapoint is added, updated, or deleted from carol/weight, Beeminder makes a POST to that webhook.json URL. The Beeminder API receives that POST like so:

POST /api/v1/users/carol/webhook

  input params: <- (this is the payload that IwcRemoteStream sends)
    user:     carol
    goal:     weighins
    action:   ADD
    source:   d.goal.shortname,
    urtext:   d.urtext,
    origin:   d.origin,
    daystamp: d.daystamp,
    created:  d.id.generation_time.to_i,
    value:    d.value,
    comment:  d.comment,
    id:       d.id.to_s

   if action == "ADD", add +1 to carol/weighins goal

And thus, when a datapoint is added to carol/weight, Beeminder updates carol/weighins, and Carol no longer needs that IFTTT recipe.
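As a rough sketch of the receiving side, the handler only needs to look at the action and add a +1 to the receiver goal. Everything here is a stand-in: `handle_webhook` is a hypothetical name, `store` is an in-memory Hash standing in for real datapoint creation, and we only know from the payload above that one action value is "ADD" (the names of the update and delete actions are not specified here, so anything else is ignored).

```ruby
# Handle a pushed datapoint event for the receiving goal.
# params: string-keyed Hash shaped like the payload above.
# store:  hypothetical in-memory Hash of "user/goal" => array of datapoints.
def handle_webhook(params, store)
  # Only newly added datapoints count; edits and deletes are ignored.
  return :ignored unless params["action"] == "ADD"
  key = "#{params["user"]}/#{params["goal"]}"
  (store[key] ||= []) << { value: 1, comment: "via #{params["source"]}" }
  :added
end
```

Ignoring non-ADD actions matches the "editing a datapoint isn't a new datapoint" stance from the open questions.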

Syntactic sugar for pushing to a Beeminder goal via the existing webhook feature

# Convert a URL for a Beeminder goal into a URL for pushing to. It's like
# canonicalizing the URL. For example if you give it something like
# bmndr.co/alice/weight with token abc123, it returns
# blahblah/api/v1/users/alice/webhook.json?goal=weight&auth_token=abc123
# Any URL that doesn't look like a Beeminder goal URL is returned unchanged.
def pushify(url, token)
  urlprefix = 'https://www.beeminder.com/api/v1/'
  # \A and \z anchor to the whole string (^ and $ would match per line)
  url.sub(
    /\A(?:https?:\/\/)?(?:www\.)?(?:beeminder|bmndr)\.com?\/(\w+)\/(\w+)\/?\z/,
    urlprefix + 'users/\1/webhook.json?goal=\2&auth_token=' + token)
end

With that in place, we can just generalize the existing webhook feature. The UI copy will be:

Webhook (for real-time data export or the very beta meta-minding feature)

(No more mention of PESOS.)

The pushify function transforms a bmndr.co/alice/foo URL into the webhook endpoint for alice/foo and leaves any other URL alone. So pushify is idempotent and can always safely be applied to the webhook URL.

Wait, new idea to be a little less magical: you have to give the URL as bmndr.co/alice/foo/webhook and we only transform it in that case. So no one would stumble into this feature but it’s not impossibly intimidating for non-technical people.
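A minimal sketch of that less magical variant, assuming we keep the same URL shapes as pushify and only add the required /webhook suffix (the name `pushify_explicit` is made up for illustration):

```ruby
# Like pushify, but only transforms URLs that explicitly end in /webhook,
# eg bmndr.co/alice/foo/webhook. Every other URL is returned unchanged.
def pushify_explicit(url, token)
  prefix = 'https://www.beeminder.com/api/v1/'
  pattern = %r{\A(?:https?://)?(?:www\.)?(?:beeminder\.com|bmndr\.co)/(\w+)/(\w+)/webhook/?\z}
  url.sub(pattern) do
    "#{prefix}users/#{$1}/webhook.json?goal=#{$2}&auth_token=#{token}"
  end
end
```

This keeps the idempotence property: the transformed URL no longer matches the pattern, so applying the function again leaves it alone, and a plain goal URL without the suffix is never touched.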

Ok but then that’s weird that that 404s so we probably want an actual route for that and…

Cycle detection

# Return list of goals that the given goal pushes to, ie, its children for DFS.
def pushees(gol)
  # Here's where the actual code will go to get the list of goals that the given
  # goal pushes to:
  ["TODO"]
end

# Whether goal gol pushing to the given children creates a cycle in the graph,
# ie, whether any path from the children leads back to gol.
# Technically there should be a global lock in case edges get added
# simultaneously that create a cycle but I guess for now we'll take that risk.
def cyclismic(gol, children, seen = {})
  # puts "DEBUG: goal #{gol} has children [#{children.join(', ')}]"
  children.each do |child|
    return true if child == gol  # found a path back to gol: that's a cycle
    next if seen[child]          # already explored, no need to revisit
    seen[child] = true
    # Keep checking the remaining children unless this subtree has the cycle
    return true if cyclismic(gol, pushees(child), seen)
  end
  false
end
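To sanity-check the cycle-detection idea without the real pushees lookup, here is a self-contained sketch over an explicit graph. The `PUSHES` hash and the name `would_cycle?` are hypothetical stand-ins; the real version would get each goal's children from wherever the callback URLs are stored.

```ruby
# Hypothetical edge list: each goal maps to the goals it pushes to.
PUSHES = {
  "carol/weight"   => ["carol/weighins"],
  "carol/weighins" => ["carol/meta"],
  "carol/meta"     => [],
}

# Would adding the edge from -> to create a cycle? That's the case exactly
# when `from` is already reachable from `to`. Iterative DFS, visiting each
# node at most once.
def would_cycle?(graph, from, to)
  stack = [to]
  seen = {}
  until stack.empty?
    node = stack.pop
    return true if node == from
    next if seen[node]
    seen[node] = true
    stack.concat(graph.fetch(node, []))
  end
  false
end
```

With this graph, letting carol/meta push to carol/weight is rejected because weight already reaches meta, while letting carol/weight also push straight to carol/meta is fine (two paths to the same goal is not a cycle).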

Open Questions:

should we use the current date when we receive this POST, or the daystamp from the pushed datapoint, when we create the new datapoint in the receiving goal (aka carol/weighins)? [default answer: use current timestamp, aka urtext be like “^ 1”]
should we make the user include their auth_token in the url? or can we get around that somehow or other?
how do we guard against loops? [currently we don’t allow anyone to set a beeminder.com url in the callback, so we’ll need to at a minimum whitelist this api-endpoint-for-handling-webhook-callbacks]. ANSWER: you do a depth-first search (DFS) from the goal (call it g) being pointed from, following the pointers. if you encounter g, stop and give an error — don’t let the user add that edge. if you finish the DFS (which is not computationally hard — it’s O(n) where n is nodes + edges) then it’s 100-emoji and you accept the edge. and of course if each goal points to at most one other goal then the DFS just amounts to following the chain.
this is not robust to changing the name of a goal, so if someone sets up their weight goal to push to a “weighin” goal, then changes the name to “weighins” the webhook would fail. [What would error messaging look like in this case? What should it look like ideally?]
setup helper of some kind [prereq for putting this on the front page, or calling it an “integration”. this should probably be its own section, or its own whole spec.]
Make it work or at least gracefully fail if you put in someone else’s goal URL.
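For the first open question in this list (which date goes on the datapoint we create in the receiving goal), the two options differ by one line. A sketch, assuming daystamps are "YYYYMMDD" strings and using made-up names (`receiver_datapoint`, `use_source_daystamp`):

```ruby
require 'date'

# Build the +1 datapoint for the receiving goal.
# Default: stamp it with the current date (the default answer above).
# With use_source_daystamp: true, reuse the pushed datapoint's daystamp.
# Assumes daystamps are "YYYYMMDD" strings and payload keys are strings.
def receiver_datapoint(payload, today: Date.today, use_source_daystamp: false)
  daystamp = use_source_daystamp ? payload["daystamp"] : today.strftime("%Y%m%d")
  { daystamp: daystamp, value: 1, comment: "pushed from #{payload["source"]}" }
end
```

The difference only shows up when someone backfills data: a weight datapoint entered today for last Tuesday counts toward today by default, and toward last Tuesday with the option on.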

Shortcoming

One major shortcoming of this scheme is that it only allows each goal to push to one URL, meaning it doesn’t completely replace IFTTT and various power user stuff that surely more than one beeminder user does with multiple recipes all triggered by one beeminder goal.

Future work / ideas:

we could add additional parameters that the user could set in the url for the /webhook.json endpoint to, e.g., pick between adding a “+1” to the pushed goal vs adding a datapoint with the same value.
other things that are like our IFTTT macros
change goal=GOALNAME param to a list of goalnames so that a single pusher goal can push to multiple receiver goals with one URL [alternately, maybe allow multiple callback URLs on one goal?]
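A sketch of how the first and last ideas might look on the receiving end, purely as an assumption: a comma-separated `goal` param and a hypothetical `mode` param (`count` for +1, `mirror` for same value). Neither the param name `mode` nor its values are decided anywhere in this spec.

```ruby
require 'uri'

# Parse the webhook URL's query string into the receiver goals and the mode.
# goal=a,b,c gives multiple receiver goals; mode=mirror copies the value,
# anything else (including no mode at all) means add a +1.
def parse_webhook_query(query)
  params = URI.decode_www_form(query).to_h
  {
    goals: params.fetch("goal", "").split(",").map(&:strip).reject(&:empty?),
    mode:  params["mode"] == "mirror" ? :mirror : :count,
  }
end
```

Defaulting to :count keeps existing single-goal URLs like goal=weighins working exactly as in phase 1.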