Human experimentation for fun and profit

I want to experiment on my users. How do I do it?

Yesterday I talked about creating a configuration service. We’re going to leverage that. An experiment is just a configuration rule that’s sharded among your userbase.

But is it that simple? Usually not. Let’s dive in!

Choosing a treatment

Iacta alea est

The easiest way to go is to just toss the dice.

You define your treatments and their percentages and roll 1d100. The user gets into whatever treatment corresponds to the value on the die. For instance:

function getTreatment(treatments, control) {
	var value = Math.random() * 100;
	for (var i = 0; i < treatments.length; i++) {
		value -= treatments[i].percent;
		if (value < 0) {
			return treatments[i].value;
		}
	}
	return control;
}
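
For instance, assuming a treatment is just a {value, percent} object (that's my reading of the loop above), usage might look like:

var treatments = [
	{value: 'blue-21', percent: 10},
	{value: 'blue-32', percent: 10}
];
// the remaining 80% of rolls fall through to the control value
config.linkColor = getTreatment(treatments, 'blue-default');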

What's this good for? Things where you're okay with changing behavior between requests. Things where your users don't need consistency. Probably where your users won't notice a lot. Like Google's 41 shades of blue.

Introduce a discriminator

So you determined you want each user to have a consistent experience. Once they enter an experiment, they're in it until the experiment finishes. How do we do that?

The simplest way is to introduce a pivot value, something unique to the user:

function toHash(str) {
	// djb2-style hash; keep it in 32-bit unsigned range so long strings
	// (like email addresses) don't lose float precision and skew the % 100 below
	var hash = 1;
	for (var i = 0; i < str.length; i++) {
		hash = (hash * 33 + str.charCodeAt(i)) >>> 0;
	}
	return hash;
}

function getTreatment(pivot, treatments, control) {
	var value = pivot % 100;
	for (var i = 0; i < treatments.length; i++) {
		value -= treatments[i].percent;
		if (value < 0) {
			return treatments[i].value;
		}
	}
	return control;
}

config.treatment = getTreatment(toHash(user.email), treatments, control);

What's great about this? It's simple, that's pretty much it.

What's terrible about it? The same users get the first treatment in every experiment. If you want to roll out updates to 1% of your users at a time, the same person always gets the least tested, bleeding edge stuff every time. That's not so nice, and it opens you up to luck effects much more.

The victorious solution

Quite simple: instead of basing your pivot only on the user, you base it on the user and the experiment. For instance:

var experiment = 'home screen titlebar style - 2016-06-12';
var pivot = toHash(user.email + experiment);
config.treatment = getTreatment(pivot, treatments, control);

This effectively randomizes your position between experiments but keeps it consistent for each experiment. We'll have to adjust the API to make it easier and more obvious how to do the right thing:

function getTreatment(userId, experimentId, treatments, control) { ... }
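
A minimal sketch of that signature, reusing toHash from above (my sketch, not the final API):

function getTreatment(userId, experimentId, treatments, control) {
	// hashing the user and experiment together randomizes bucketing per experiment
	var value = toHash(userId + experimentId) % 100;
	for (var i = 0; i < treatments.length; i++) {
		value -= treatments[i].percent;
		if (value < 0) {
			return treatments[i].value;
		}
	}
	// whatever percentage is left over falls through to the control
	return control;
}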

Dependencies

You will often have several simultaneous experiments. Sometimes you'll need a person to be enrolled in one specific experimental treatment for another experiment to even make sense. How do we do this?

First off, we'll adjust our treatment API so that, instead of an array of treatments, you send a JS object:

var homeScreenTreatments = {
	control: {value: {bgColor: 'black', fontSize: 10, bold: true}},
	t1: {value: {bgColor: 'black', fontSize: 12, bold: false}},
	t2: {value: {bgColor: 'gray', fontSize: 10, bold: true}}
};

Next, we'll stash our treatment decisions in the framework (in a new cache for each script run). Then we'll let you query that later. For instance:

var homeScreenExp = 'home screen titlebar style';
config.homeScreen = getTreatment(
	user.email, homeScreenExp, homeScreenTreatments);
// 50 lines later...
if (hasTreatment(homeScreenExp, 't2')) {
	config.fullNightModeEnabled = false;
}
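
A sketch of the per-run cache behind hasTreatment might be as simple as this (the names are mine, not a fixed API):

// a fresh cache is created at the start of each script run
var treatmentCache = {};

// called from inside getTreatment once a treatment has been chosen
function recordTreatment(experimentId, treatmentName) {
	treatmentCache[experimentId] = treatmentName;
}

function hasTreatment(experimentId, treatmentName) {
	return treatmentCache[experimentId] === treatmentName;
}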

We could alternatively bake experiments into the rule infrastructure itself: a rule would specify the config section it supplies, its treatments, and their percentages. But that path ends in a complex UI that does 90% of what users need in an inflexible way, and that's going to be troublesome.

What we do want is to store the collection of experimental treatments on the config object. We'll get into that later, but it looks like this:

config.experiments = {
	'home screen titlebar style': 't2',
	'wake up message': 't5'
};

Incremental releases

Another common thing people want to do is roll out new features gradually. Sometimes I want to roll it out to fixed percentages of my users at fixed times. One option is to introduce a "rule series", which is a collection of rules, each with a start and end date. No two rules are allowed to overlap.

So I set up a rule series "roll-out-voice-search" with a simple set of rules:

// in the UI, I set this rule to be effective 2016-06-10 to 2016-06-15
config.voiceSearchEnabled = getTreatment(
	user.email,
	'roll-out-voice-search',
	{
		control: {value: false},
		enabled: {value: true, percent: 1}
	});

And I make a couple more rules, for 10%, 50%, and 100%, effective in adjacent date ranges.
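
Laid out as data, that series might look something like this (the shape is purely illustrative):

var rolloutSeries = [
	{start: '2016-06-10', end: '2016-06-15', percent: 1},
	{start: '2016-06-15', end: '2016-06-18', percent: 10},
	{start: '2016-06-18', end: '2016-06-21', percent: 50},
	{start: '2016-06-21', end: '2016-06-25', percent: 100}
];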

But this is a common pattern. So we can simplify it:

config.voiceSearchEnabled = gradualRollout({
	user: user.email,
	rollout: 'roll-out-voice-search',
	start: '2016-06-10',
	finish: '2016-06-25',
	enabled: {value: true},
	disabled: {value: false}
});

And we can very easily interpret that as a linear rollout over the course of fifteen days, keyed on the user's email address.
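
A sketch of how the framework could interpret that (simplified date handling, reusing the toHash bucketing from earlier):

function gradualRollout(opts) {
	var start = new Date(opts.start).getTime();
	var finish = new Date(opts.finish).getTime();
	// fraction of the rollout window that has elapsed, clamped to [0, 1]
	var fraction = (Date.now() - start) / (finish - start);
	fraction = Math.max(0, Math.min(1, fraction));
	// same user + rollout-name hashing as before, so a given user flips on exactly once
	var bucket = toHash(opts.user + opts.rollout) % 100;
	return bucket < fraction * 100 ? opts.enabled.value : opts.disabled.value;
}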

Metrics

You don't just assign experiment treatments to people and forget about it. You want to track things. And that means the client needs to know your entire configuration. But the full configuration can be awkward to work with. So you want to see experimental treatments directly, by name, not as a bunch of configuration values that you have to work backward from to figure out which treatment produced them.

Separately, you need a system to record client events, and you submit the experiment treatments to it as tags. Then you can correlate treatments to behavior.
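
For instance, the client-side event call might look roughly like this (the event-recording API here is made up for illustration):

// tag every client event with the treatments this client was assigned
events.record('search-performed', {
	latencyMs: 87,
	experiments: config.experiments
});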

Speed

One complaint you might have is that this approach always fires every rule in sequence, and that's slow. The Rete algorithm is used in a wide variety of rule engines and is faster than naive reevaluation, so we should use that here, right?

Wrong. The Rete algorithm is complex and requires us to build up a large data structure. That data structure is used when a small portion of the input changes, letting me avoid recalculating the whole result.

In my case, I'm getting a series of new configurations, and each one is unrelated to the last. I might get a call for one collection of rules and then not get a call for it in the next hour. Or a rule might throw an error and leave the Rete data structure in an invalid state. Or I might have to abort processing, again leaving the data structure in an invalid state. Since each evaluation is independent, the precomputation never pays for itself, and a straightforward pass over the rules is both simpler and safer.

Future directions

The main target here is to look at what people are doing and try to provide more convenient ways of doing it.

We also want to provide the ability to mark portions of metadata as private information, to be redacted from our logs.

IP geolocation would be handy, allowing us to tell people where the client is located rather than relying on the client to self-report. We can grab a country-level GeoIP database for $25/month, city-level for $100/month. This would be strictly opt-in, possibly with an additional fee.

Finally, we have to turn this into a proper service. Slap a REST API in front of it, add in HMAC authentication and API usage reporting, service health metrics, and load balancers.

That concludes this short piece on creating an experiment system.

Configuration as a service

I’m working on a rule engine targeted toward configuration-as-a-service and experiment configuration. Since it’s nontrivial and not much exists in this space, I thought I’d talk about it here for a bit.

Configuration as a service? Huh?

There are a few things this can be used for.

Recall when Google wanted to test out 41 different shades of blue for search result links? They used an experiment system to enroll randomized segments of the userbase into each treatment. That’s one use case we want to support.

Let’s say I’m implementing a phone app and it’s got a new feature that I want to get out as soon as possible. I need to QA it on each device, but I’m pretty sure it’ll just work. So I ship my update, but I keep the feature off by default. Then I add a rule to my configuration service to turn it on for the devices I’ve QA’ed it on. As I finish QA on a given device, I update the rule to turn the feature on for that device.
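
In rule form, that might be as simple as this (using the input/output convention shown later in this post; the device list and flag name are made up):

// devices that have passed QA so far
var qaedDevices = ['bullhead', 'bacon'];
output.newFeatureEnabled = qaedDevices.indexOf(input.device.name) !== -1;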

Or maybe I need to take legal steps in order to provide a feature in a given country. The client sends its location, and I’ve added rules to determine if that location is one where I can legally enable that feature. It might also include, for instance, which of my API endpoints it should use to store any server-side data — some countries require user data to remain in EU borders.

What are we implementing?

We want to offer a multitenant service so that you can pay us a bit of money and get our glorious configuration service.

You will submit JSON metadata to us and get JSON configuration back. You will enter rules in a UI; we’ll execute those rules against the metadata to get your configuration. The rule UI will let you say: this rule comes into effect on this date and stops on that date; it’s got this priority; let’s test it against this sample input… Not too complex, but some complexity, because real people need it.
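
For instance, a round trip might look roughly like this (the field names are illustrative, not a fixed schema):

// metadata the client submits
var metadata = {
	userId: 'alice@example.com',
	device: {name: 'bullhead', os: 'android', version: '6.0.1'},
	country: 'DE'
};

// configuration the rules produce
var config = {
	voiceSearchEnabled: true,
	homeScreen: {bgColor: 'gray', fontSize: 10, bold: true},
	experiments: {'home screen titlebar style': 't2'}
};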

There are two basic parts: first, a service to execute rules; then, a website to manage rules. In between we have a rule engine.

Any significant caveats?

We’re running a configuration / experimentation service. We want third parties to use it. That means security.

We need to prevent you from calling System.exit() in the middle of your rules and bringing down our service. All that normal, lovely sandboxing stuff. Timeouts, too.

Also, you’re updating your rules pretty frequently. We need to be able to reload them on the fly.

Rules are code, and code can have bugs. We’ll have to watch for thrown exceptions and report them.

What’s already out there?

Drools

The heavy hitter, Drools has been around since the dinosaurs roamed the earth. It’s not easy to work with. It takes way too much code to initialize it, and most of that code is creating sessions and factories and builders and containers that have no discernible purpose. If you try to read the code to figure out what it all means, prepare for disappointment: it’s a snarl of interfaces and fields set via dependency injection and implementations in separate repositories.

Drools rules accept Java objects and produce output by mutating their inputs. That means I need a real Java class for input and another for output. Their rule workbench lets you create your Java classes, but that means you need to publish your project to Maven. And loading multiple versions of a rule is an exercise in pain.

On the plus side, it gives you a rule workbench out of the box, and it has a reasonable security story. However, it doesn’t have any way to limit execution time that I’ve found, meaning you have to run rules in a separate thread and kill them if they take too long. This isn’t nice.

Easy Rules

The new kid on the block, it uses Java as a rule language, which brings us to JAR hell like Drools. Unfortunately, it doesn’t supply a workbench, it doesn’t offer a way to provide inputs and retrieve outputs, and it doesn’t have any sandboxing or time limits. At least the code is relatively straightforward to navigate.

Everyone else

OpenRules is based on Excel. Let’s not go there.

N-Cube uses Groovy as a DSL, which implies compiling to a JAR. It’s also got almost no documentation.

There are several others that haven’t been updated since 2008.

So they all suck?

No. They’re built for people who want to deploy a set of rules for their application within their application. They’re for people who trust the people writing business rules. We are building a service whose sole purpose is to supply a rule engine, where untrusted people are executing code.

When you are building a multitenant service around one specific task, you shouldn’t be surprised when off-the-shelf components built for a different trust model fall short of your needs.

What do we do?

The core thing that our service does is run user code. Let’s bring in a scripting engine. And since we’re going to accept JSON and emit JSON, let’s use a language that makes that natural. Let’s use Javascript.

The Rhino scripting engine makes it easy to run code and easy to filter which classes a script is allowed to use. Let’s just use that. Now we accept a rule from a user, wrap it in a light bit of code, and run it:

// we inject inputString as the raw json string
var input = JSON.parse(inputString);
var output = {};
// insert user code here

When we want to run it, we can just write:

Context ctx = Context.enter();
ctx.setClassShutter(name -> {
	// forbid it from accessing any java objects
	// (as a practical matter, I probably want to allow a JsonObject implementation)
	return false;
});
if (rule.compiledScript == null) {
	compile(rule);
}
Scriptable scope = ctx.initStandardObjects();
scope.put("inputString", scope, Context.toObject(inputString, scope));
rule.compiledScript.exec(ctx, scope);
response.write(scope.get("output", scope));

That’s not the whole story — we want to limit the amount of time it has to finish executing, set up logging and helper functions, all that jazz. We need to locate the rule somehow. We probably have multiple rules to run, and we have to propagate partial output objects between them (or merge them after). We also have to determine what order they should run in.
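
One way to propagate partial outputs between rules, for instance, is to seed each rule's wrapper with the previous rule's output instead of an empty object (a sketch; previousOutputString is my invention, injected by the host the same way inputString is):

// previousOutputString is '{}' for the first rule in the chain
var input = JSON.parse(inputString);
var output = JSON.parse(previousOutputString);
// insert user code here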

But, for what this does, it’s maybe half as much code as Drools takes.

What’s so much better about your approach?

The first huge advantage is that I’m using a scripting engine, one that doesn’t shove a bunch of classes into the global classloader. That means I can update everything on the fly. I’d get the same if I made Drools talk JSON, but that’s harder than writing my own engine.

Compared to Drools or EasyRules, I don’t have to maintain a build server and figure out how to build and package a Java project I generate for each rule. I just shove some text into a database.

Javascript handles JSON objects quite well, which means not having to create a Java class for every input and output. That is the largest part of savings — Drools would be acceptable if it could talk JSON.

The people writing these rules are likely to be developers, not managers or analysts. They probably know Javascript, or can fake it pretty well.

What’s the catch?

Drools is huge and complex for three reasons.

First, it had significant development going on in an age when huge, complex coding was de rigueur in Java.

Second, it had a separation between API and implementation enforced for historical and practical reasons.

And third, it solves complex problems.

You want your rules to just work. Drools has a lot of thought behind it to determine what “just working” should look like and make sure it happens. We haven’t put in that thought. I think the naive approach is pretty close to the intuitive result, but I haven’t verified that.

The rules accept and generate JSON. This means you lose type safety. On the other hand, the API accepts and generates JSON anyway, so this is pushing things a step further. Not great, but not the end of the world.

Javascript is kind of ugly, and we’re promoting its use. It’s going to be a bit crufty and verbose at times. The point of business rules in the Drools language or what-not is so that managers can read the rules, and we’re kind of missing that.

What do these rules look like?

An example rule:

if (input.device.name == 'bacon') {
	output.message = 'Congrats on your OnePlus One!';
}
if (input.device.name == 'bullhead') {
	output.message = 'Congrats on your Nexus 5X!';
}
if (input.device.uptime > 31 * 24 * 60 * 60) {
	output.sideMessage = "It's been a month. You might want to reboot your phone.";
}
output.homeScreenTreatment = Treatments.choose(
	'homeScreenTreatment',
	input.userId,
	{
		control:  {value: {backgroundColor: 'black'}},
		grayBg:   {percent: 5, value: {backgroundColor: 'gray'}},
		grayBold: {percent: 5, value: {backgroundColor: 'gray', bold: true}}
	}
);

I’ll talk a bit more about the experiment side next time.

Capitalism and consent

I’ve heard radical feminists talking about how having sex for money is rape, because the exchange of money prevents free consent under capitalism. And I’ve heard a response that, since you can refuse an offer to trade money for sex, it’s still consent.

We’ve determined as a society that lesser forms of pressure applied to a person are wrong. Threats of violence and blackmail are both illegal ways to get someone to act a certain way. If I blackmailed someone into working a particular job, that’s wrong. I don’t have to physically take hold of a person and move their limbs through the appropriate motions.

Threatening someone with death is wrong. Threatening to take away someone’s life-saving medication is wrong. Threatening to throw someone out in the wilderness in light clothing in Alaska in January is wrong. Threatening to take away the only food someone can obtain is wrong.

Unless you have a capitalist excuse.

It’s fine to evict someone from their apartment for not paying their rent. It’s not fine to throw someone out of the house they own. These acts are effectively the same, modulo capitalism. But our society says that the first is fine and the second is wrong. This is bizarre!

But we as a society have determined that that’s the way things will be, and that’s good and right, even if it kills people. Compelling someone with economic threats in an approved capitalist manner is perfectly fine. This isn’t right and it isn’t good, but that’s the society we’ve built.

The short of it is, society sets up a situation in which you must get money or die, and I can take advantage of your plight.

(Sex work is no different from any other sort of work this way. Some people say that sex work is degrading, but I’d rather have sex for money half a dozen times than spin a sign in the streets for a whole month.)

Since it’s society setting up that cruddy situation and individuals deciding whether to contribute to your continued survival, there’s a diffusion of responsibility problem. It’s very hard for me to give up money that goes to my own continued survival to help you with your continued survival. It’s much easier to push for society to be rearranged among gentler lines. (This would have to be a holistic change. There’s no sense going after one profession in particular while ignoring the rest.) How might we do that?

Welfare allows you to suffer but not quite to the point of death, assuming you can prove to a bureaucrat that your suffering is sufficiently abject.

Charity is much like welfare. It’s hard to find an appropriate charity to help you. What sort of aid you get is up to a body of do-gooders who are typically convinced that you’re going to drink away any cash you get (and that that’s a bad thing and their responsibility to stop).

A minimum wage requires you to sell your body, but at least it guarantees a certain rate for the deal.

Basic income and guaranteed minimum income free you from having to sell yourself if your needs are modest. You don’t even have to prove that you are suffering to get help. But people are afraid that fewer people would work and it would cost them some of their treasured amenities. Or that it would take greater and greater monetary incentives to get people to work the jobs required to let everyone live — that there would be famines due to a lack of farm workers, for instance. Guaranteed minimum income is slightly worse in this regard: an employer must pay a lot in order for me to improve my lot by working.

Socialism gets rid of this problem entirely. In exchange, you may be conscripted to help your society when you otherwise would not want to — just like a draft.

Like those radical feminists, I think the economic exchange can take away part of your ability to consent freely — as long as you need to work to make money to survive. The problem is not sex work, any more than we should single out window washers or automotive manufacturing workers as a profession to “save”. The problem is that capitalism determines who deserves to live. I’m not going to label sex work as rape, any more than I’m going to label having a job as slavery, but capitalism has far too much power, and we need to fix that.

For the record, I favor basic income over socialism. Planned economies remove some inefficiencies, but free markets are very good at routing goods to those who want them.

Wage gap and controls

The United States has a strong male/female wage gap. Women make roughly 82% of what men make (among full-time, non-seasonal employees). An ideal wage gap would be one small enough not to be statistically significant.

When you control for factors like education, experience, and field, the wage gap drops significantly. Huzzah, job done, no more inequality! Yes?

No.

Discrimination in hiring practices between different fields with different pay grades will produce wage inequality. In fact, there’s evidence that women joining a field in numbers reduces wages in that field.

Discrimination in education will reduce women’s ability to get jobs in fields that require education.

Discrimination in how jobs are marketed or in the hiring process will reduce the likelihood that women choose jobs in a field. For instance, if it’s common to meet over drinks when interviewing for a software development job, that’s effectively discrimination: the CDC recommends that women who are not on birth control eschew alcohol, and women are generally instructed to avoid alcohol in order to avoid rape. So try to avoid introducing alcohol into your hiring process.

Discrimination in parental leave and expectations of child rearing will result in women having fewer years of experience at a given age. This is even more significant because women are more likely to be in fields with low wages, so it makes more sense for women rather than men in heterosexual marriages to give up their jobs.

Discrimination in who is expected to be a primary caregiver outside of parental leave has another impact. Primary caregivers must be available to tend to sick children. They must have schedules that match their kids’ school schedules. They must have flexible schedules for when their children have a half day at school. This restricts the type of job that caregivers can hold and what companies they can work for. This sort of restriction competes with optimizing for wages, which, in a capitalist society, is the duty of every worker. This is bad enough in a heterosexual marriage, where it would be in theory possible to share work evenly between men and women. But over a quarter of children live with only one parent, and more than three fourths of single parents are women.

These are all real and important forms of inequality that people remove from their calculations when trying to derive a “real” wage gap from the raw numbers.

It’s sensible, for certain reasons, to control for these factors. If you’re only looking for a certain type of unfair compensation practice that’s common across many fields, this can be useful. If you’re deciding whether to stand on the deck of an aircraft carrier and declare “Mission Accomplished”, you should look at the raw numbers. And if you want to know where inequality is cropping up, you need to slice and dice the numbers six ways from Sunday instead of trying to control for a handful of traits. What’s clear is that we still have work to do.

Google Play Movies & TV UI review

I bought a season of a television show on Google Play and am watching it on PC, via the web interface. And it’s not great.

I want to watch the series in order, naturally. So of course the UI supports that; I can start watching episode 1, and when it finishes —

Oh, it doesn’t actually advance me to episode 2 at that point. It sticks a loading animation on top of the final frame and does nothing. I have to close the playback screen and manually select the next episode.

Well, that’s okay; at least it tells me which episodes I’ve watched, right? Well, turns out that’s a mobile-only feature. The web interface doesn’t show that information.

If I start watching an episode, though, I can at least see the title of it in order to find the next episode in the list, right? Well, actually, turns out that doesn’t work. I just finished watching episode 10, and it’s got the info for episode 7 up. (At least, I think it was episode 10. It might have been 9. Don’t know until I try it!)

Okay, that’s bad, but at least it’s easy to navigate to the shows I’ve purchased, right? It actually isn’t that hard — I’d prefer it if it showed those to you by default and let you go to a store page with an extra click, rather than showing you the store and then giving the option to look at just what you can watch right away. But in my case, I’ve purchased season 1 of the show, and when I go to “My movies & TV” and click on the show, it takes me to season 2, which I don’t even own.

I can watch stuff successfully, and that’s even on Linux, so bonus points for that. And I can watch on my Android tablet — without even installing a third-party app store. The Android version even has a little progress indicator to tell you how much of an episode you’ve watched, which is pretty handy the first time you watch something. (Can you reset those progress indicators? I doubt it. So it’s only good for the first binge.)

What do I want?

  • Let me get to my library fast. Two clicks is the absolute maximum.
  • When I go to a television show, if I came there from my library, show me the last season I’ve watched part of.
  • Include next/previous links for seasons. There’s currently a dropdown, which, if I’m in season 4 of Stargate, isn’t so handy.
  • Always, always show me where I am. Highlight the episode I’m watching in the episode list, and show the summary of the episode when I’m done watching it (instead of showing another episode’s summary).
  • Always show which episode I’ve most recently watched, if I’m not watching something already.
  • Maybe list when you’ve last watched each episode.

It’s not that hard. You use the product for two hours and these problems become obvious.

Now, I’m not faulting Google engineers here. Even people of Google quality could implement these changes within a few months. It’s product management that’s at fault, an emphasis on whitespace and a lack of UI elements over functionality. I hope they get over it eventually.

What else is there to the UI? Your main page on Movies & TV has a list of movies followed by a list of television episodes. The episodes are categorized by series, which is good. They’re ordered by purchase date, I believe. If you buy a whole season at once, though, they’re ordered from last episode to first. That…is stupid.

When you’re looking at a single season, you see each episode as a card, and the cards are shown in a grid four wide. A list would be easier to use — you’ve just got one dimension to worry about, and that gives you more width for titles, which are usually cut off. (The cards give you about 15 characters, so I can see titles such as “Auld Acquaint” and “Hurricane Fluff”.) If you used a list, you could also accordion out each episode.

There’s a little red checkmark on episode cards for episodes you own. On the mobile app, there’s also a red checkmark — but this indicates whether you’ve downloaded the item, not whether you own it. It’s not self-explanatory either way.

I think the key takeaway is that people who design these things must be forced to use them if you want anything vaguely usable. But maybe they did use it and still came up with this. Not sure. And maybe I should become a UI usability expert.

Pacing and development in DC comics

I recently picked up a copy of the first few volumes of Supergirl (2011) and Blackest Night. They’ve all got the same problem: they’re outrageously rushed.

The amount of dialogue and the number of scenes in two hundred pages of comic (which is about average for Supergirl) is roughly even with 1-2 hours of a television show. A volume is expected to comprise one story arc. It introduces new characters, creates a conflict, and then resolves the conflict.

In a television show, a season might comprise one story arc and several filler episodes. This lets you tell a more involved story. It lets you cover more plot and show characters in more depth. This is a good thing in general. If you don’t have that much space to tell more story, you can work with it — you introduce fewer characters, make the conflicts more immediate, and tell smaller stories. But you have to work with it.

In both Supergirl and Blackest Night, the authors didn’t. They made no concessions to the medium. It’s a problem. I think they were trying to map one issue to a television episode, roughly, but they only have time for three to four scenes per issue, while even a half hour television show can cover a lot more ground.

Take Supergirl. Kara Zor-El finds a human who can speak Kryptonian: Siobhan. Instant trust. They'll risk their lives for each other immediately. On television, we'd see several scenes of them interacting and building a friendship first. Later, Kara finds a character named H'El, whom she trusts deeply and implicitly after exchanging a few words. In a television series, there would be at least a little interaction between them, giving time for that trust to build.

This is bad. It’s unrealistic and paints characters as hopelessly naive and idealistic. At best, readers will assume that the character building and essential interaction is happening in some other series featuring the same characters. Is it? I doubt it, and I’d have to consult several wikis and spend $50 to check.

Blackest Night, on the other hand, relies on several years’ worth of comics to set the stage. It’s clear that a large number of heroes died, but if this were the opening to a television season, we’d have something to remind us how they died. (It’s perhaps unfair to compare New 52 titles to previous works, granted.) It introduces its antagonists as a primal force rather than characters and does nothing to show their motivations. Their origins are given very little time; you might as well describe Alan Scott’s origin story by saying that, in the beginning of the universe, the white light of life separated into the emotional spectrum, and Scott received an artifact that allowed him to use the green light of will.

Actually, that’s a more adequate and thorough origin story than we’re given for the antagonists in Blackest Night.

The result is a lot of confusion. In this case, since we’re starting with characters with a lot of history behind them, we don’t have the same problem with a lack of characterization — except for one Heel Face Turn and the primary antagonist — but we do have a horribly rushed main conflict.

This is bad storytelling.

It’s two types of bad storytelling in two different series, but it’s motivated by the same thing: reducing the number of pages to explore a given conflict. Because artwork is expensive to produce and expensive to print. And that is simply unfortunate.

Police, huh! What are they good for?

If we just disbanded the NSA on the grounds that it’s overstepped its bounds and isn’t providing sufficient value, I doubt we’d notice. But the police are more integrated with society. They handle issues that are more likely to affect us.

Police in the US have started killing people for their skin color. They rape sex workers as if it’s their job — and according to their police chiefs, it is. They are explicitly told to make drug arrests in predominantly POC neighborhoods — primarily black, also Hispanic. They’re subject to no effective oversight. The public in the US has the patience and attention to see one or two officers tarred and feathered per year, but the primary effect is to force the officer in question to move to another county and get another job on another police force.

To be clear, not every officer is killing black people. Not every officer is raping sex workers. Not every officer is targeting black kids for drug arrests. But it’s easy to find a department to support you after you’ve murdered an unarmed black man. Every officer who is assigned to vice is raping sex workers or actively assisting with the process. Police departments are assigning cops quotas and patrols specifically to ensure that it’s mainly black kids being arrested for recreational drug possession.

The problems are too widespread, and many of them are baked into the political structures of policing. At this point, it would be a herculean task (or possibly sisyphean) to root out the corruption and outright evil in police departments. It’s much more efficient, much cleaner, to tear them to the ground and replace them with something else. Something designed to protect people from itself above all.

So what are police good for, and what should they be good for, that we want a replacement to handle?

Help people obey the law

The police as is don’t help people to obey the law. They will inform you that you are violating the law and process you so your punishment can be assigned. In routine cases, this means they will enact a fine. In less routine cases, they will transport you to a jail to be reviewed by a judge, possibly with jury involvement.

Ideally, we will provide people to help you obey the law.

Traffic safety provides a lot of tickets. Part of that is vehicle maintenance. If you have a busted tail light, a police officer will pull you over and talk to you about it, tell you to get it fixed, maybe issue a fine. How about hiring traveling auto mechanics instead? You drive around and see someone has a broken tail light, so you pull them over and ask if you can fix it for them. You see that someone has an expired inspection sticker, so you schedule an inspection for them — and you can even offer to drive the vehicle over if they’re having trouble getting to the shop.

You need licenses to do a variety of things. Selling certain items, for instance. A police officer would close down your store, issue a citation, maybe arrest you, depending. We want people who will instead walk you through the process of getting that license. We want roving ombudspeople.

Resolving disputes

Police are sometimes called in, in places where they’re still respected (or at least feared less than someone who’s presenting a more immediate threat), to resolve disputes between people. The median required training duration for conflict resolution among police schools that require it is eight hours. One in six schools don’t even require it.

We need mediators.

We need neutral mediators who are not going to threaten to send you to jail for calling them to help you get away from an abuser. (It’s becoming increasingly common for police responding to domestic violence calls to simply throw all adult parties in jail.) The mediators need to be able to bring abuse victims to shelters and remove abusers from homes. For situations that aren’t cut and dried, they need conflict resolution training, and probably more than a one-day seminar packed in the middle of combat training.

Since these cases can turn violent, these mediators have to be able to avoid injury. They might need defense training beyond avoiding injury. I don’t know. It’s a bad sign if that sort of training is often called for; it means the mediation portion of the training isn’t sufficient or isn’t right. Maybe we can start out with people working in pairs: a mediator who goes in first and a bouncer as backup who stays outside unless called for.

Investigating crime

Crime will still happen. We need people with the necessary training and equipment to track down criminals in cases where tracking them down is important. Beat cops don’t help much here. It’s detectives and crime scene analysis people and the FBI. We could incorporate it all into the FBI. However, that might simply turn the FBI into today’s cops.

This is a hard problem. I’m still looking for options.

Stopping crimes in progress

Police attempt to stop crimes as they happen. In the case of violent crime from a determined perpetrator, that requires equipment and training to enact violence on people. Insofar as such people are needed, we should not conflate their roles with the other roles that cops are or should be fulfilling today.

Preventing crime

Preventing crime is not a goal for police.

Crime prevention is difficult to attribute to individual officers. If a district patrolled by twenty cops sees a 5% reduction in crime year-over-year, that might net a bonus to the officers who patrol there. But nobody gets to police chief by seeing a reduction in crime. They get promotions for more measurable and attributable actions.

They get rewarded for sending people to prison.

That’s the opposite of what we want. We want fewer criminals. We should rejoice if the NYPD didn’t find anyone to arrest for a week. But our incentives are perverse.

Crime prevention in general is a multifaceted thing, and a police force or the equivalent isn’t going to stop all crime. But there are some obvious things we can do to reduce violence.

Ban automatic and semiautomatic firearms, institute a buyback program, and we would likely see a strong reduction in firearm-related suicide and homicide. (Australia saw a 70% reduction in firearm-related suicides following their buyback program. Their homicide rate was insufficient to see statistically significant reductions, even with a 50% drop, because they just don’t have that much murder.)

Improve education. There’s correlation between dropping out of high school and committing crime. The Alliance for Education estimates that a 5% improvement in education rates would drop felonies by over a hundred thousand per year, at an overall savings to states on the order of $18 billion.

Reduce poverty. There’s a strong link between poverty and likelihood to commit crime. Education helps somewhat, but as we move to more automation (which is good!) and as our outsourcing trends continue, we will see more widespread poverty as jobs disappear. In January, for instance, there were 160,000 Uber drivers. In ten years, when driverless cars become widespread, Uber’s employment rates will plummet, their operating overhead will drop, and we’ll see more concentration of wealth. This doesn’t help to reduce poverty. There are 3.5 million truck drivers. Highway driving is a simpler problem to solve than city driving, so in ten years we’ll start seeing a lot of automated long-haul trucks. That will drop decent wage jobs that tend to support small towns throughout the country.

Institute consent education in all schools, from preschool to grad school.

Traffic safety

Cops enforce traffic laws. Or rather, they turn breaking traffic laws into a lottery.

Driverless cars will follow traffic laws. Everyone tends to follow a compromise between safety, speed, and obeying laws. The people who tend to speed a lot also tend to buy cop detecting devices.

Regardless, traffic violations are a short-to-medium term consideration. In fifty years, driving might be illegal — or merely accompanied by so large an insurance premium as to make it unjustifiable for most people. And even today, traffic safety isn’t a police objective; traffic patrols are primarily a revenue source. Perverse incentives again. We can probably just eliminate this function entirely.

Medical first response

Police get a modicum of medical training — the median is about three days’ worth. It’s valuable to have more people in a community who have medical training and can be called on quickly to deal with medical emergencies.

I have wilderness first responder training. Everyone should have wilderness first responder training. It should be a required part of high school. We should encourage more people to become EMTs. We should have a Medical National Reserve like the Army Reserve: people would get EMT training and equipment, they'd occasionally be called on to handle medical emergencies nearby, and they could hold other jobs the rest of the time.

Or we just hire more EMTs.

We could even have a draft, if necessary. I’d be much happier about drafting people as EMTs than as soldiers. It’s a high-stress job and physically demanding, so shorter terms would be better.

Enabling more complex laws

With a dedicated law enforcement staff that punishes people for violating laws, you can have more laws and more complex laws.

Ordinary citizens must be able to understand the law. They should be able to predict as much as possible and easily memorize the rest. It is a bad thing to have complex laws. But insofar as complex laws are necessary, we will employ specialist ombudspeople.

Ingesting people into prisons for cheap labor

In the United States, slavery is legal as a punishment for crimes. This is mediated through prisons, many of which are for-profit companies. Most prisoners do get paid for their work, but the rates are exploitative: it's unheard of to make even half of minimum wage as a prisoner.

The police serve as a means to induct people into prisons, largely via the “war on drugs”. They choose people who are unlikely to have resources — or relatives with resources — to object or fight back. People with zero political clout.

Prisons should not be slave barracks. Some few people need to be kept apart from the rest of the population; prisons should hold them securely and humanely. Some people commit crimes and need to be taught so they will not do so again.

Slavery could be used as a deterrent, and we could debate the effectiveness. But we’re hiding the fact that we’re using slavery, so we’re not even doing that.

We will not continue to use slavery. We will instead halt the war on drugs and try to decriminalize the nonviolent activities that are used to promote slavery.

How do we get there?

This is, unfortunately, a radical undertaking. We need it, but it's a collection of sweeping changes: it requires us to allocate a ton of money and train a bunch of people (even finding qualified teachers will be difficult), and then we end up with a lot of former police officers who are disgruntled, probably armed, and probably eager to prove to America that traditional policing is necessary and we'll have anarchy without it.

One early step is a firearm buyback program and firearm restrictions. Then we need to reduce the police force’s armaments. We also need to reduce poverty in a significant way — a basic income would reduce crime and, with it, the need for police. We also need to end the war on drugs as soon as possible, forbidding for-profit prisons at the same time.

Unfortunately, legal simplifications will be outrageously difficult to enact. Laws aren’t passed just to make lives difficult; they each have their own reasons and histories. And the transition from a paid lawyer model to a public ombuds model will not be particularly well received.

We can make large inroads on the number of police and police power. It’s within our reach. It just won’t be easy.

Why I’m not in the D community

D is a great programming language in many ways. It’s got a host of features to make your life easier. It’s got syntax that’s familiar to anyone who knows Java, which is almost every programmer these days. It does away with a lot of cruft by making the syntax lighter and by making reasonable assumptions for you.

On the library front, they took the concept behind LINQ and kicked it up to eleven. It’s pretty awesome overall. There’s a working coroutine implementation, and it’s pretty efficient, plus you can subclass the Fiber class and provide your own scheduler. The standard library is mostly okay, missing some things you’d expect it to have. There’s a package manager, but it’s pretty new. There’s no corporate support for anything, though — no AWS client, no Google API client, no first-party datastore drivers, nothing. So get used to writing your own stuff.

Still, on the whole, it’s a reasonable option for some use cases, and I’ve been working off and on to create a MUD in D.

But I’m leaving the newsgroup, I’m not going to report any bugs, and I’m staying off the IRC channel. And I’m probably never going back.

Why? Because D’s community is garbage.

If you want a programming language to gain adoption, you need to make it friendly to novices. You need to make it easy to learn. You need a standard library with good documentation. You don’t have to change the features that your language exposes, necessarily, but you do need to provide the resources people need in order to start using the language.

Hardly a day goes by without people on the newsgroup expressing or implying a strange sort of pride in how hard D is to learn, or in how the documentation isn't easy to understand quickly. When people point out problems, there is always someone eager to pipe up that it isn't a problem because they managed to learn it, or that it's okay for something to be presented in entirely the wrong way because the data shown is data that needs to be available.

Say something needs to be improved and people will derisively ask “Where’s your pull request?”

This isn’t a good attitude to have.

To be clear, this isn’t everyone. It’s maybe one in ten. Walter and Andrei, most importantly, don’t do this. But they do nothing to stop it.

So I will use D, when it’s appropriate. I will even release open source projects in D. But I won’t join in the wider community.

Gendered language in examples

Language is often gendered. It’s absurdly gendered. We’re eroding that a little in some places, but that’s incredibly slow.

One suggestion I’ve heard regarding gender neutral language is that authors (for instance, of philosophical articles) use their own pronouns and gender when referring to hypothetical people. This seems fair on the face of it, yes? It’s a simple rule, too.

Problem. Many fields are male-dominated. If we add that rule for fairness, then the literature will feel just as male dominated as the collection of authors. It explicitly extends the unfairness in the field to the literature under the guise of fairness.

Instead, let authors write about people of a gender that is not their own. It's not nearly as useful to have good representation in a field's literature as to have good representation in its authorship, of course. However, it will at least get men used to hearing about women and thinking of them. It would be a tiny thing to help reduce the amount of sexism in the field. Hopefully.

In reality, this would rather easily identify an author’s gender, which is undesirable in a number of situations. A reasonably anonymous policy would use something unrelated — for instance, the entire field might rotate between masculine, feminine, and agender / genderqueer examples on an annual basis.

Anyway, this is a tiny proposal that has no way of getting any traction, but whatevs.

A Return to Go

I’ve switched jobs and am using even more Go. While I previously talked about Go, it was a while ago, and I was using it inside Google, with a vastly different build system than is inflicted on the wild. So I have a new perspective on it, and I’m updating my opinion.

Concurrency

Concurrency is why you write code in Go rather than any other language, right? It’s Go’s shining feature. Aside from goroutines, you’re pretty much left with C with a facelift and garbage collection, and that’s probably not the thing you most want for application or service development.

Go's not concurrent. It's as concurrent as Node.js, except that you don't need to structure your code with callbacks. That's it.

In Java, I have true concurrency. I’m not saying this because of OS threads versus cooperative multitasking. No, the problem with Go’s concurrency is that it’s entirely hidden from you. Java actually gives you thread objects. It lets you cancel the execution of a thread and check on its status. And that’s what I need, much of the time.

In my spare time, I’m writing a MUD. This involves a ton of AI routines, scripts, and user input handlers, each of which is easier to write as a sequential operation. So I want some sort of multitasking, and since I’m estimating a huge MUD world might need over half a million threads, OS threads won’t do. Can I use goroutines?

No.

I need a reliable system. That means a process that checks over each scripted object to ensure it's actually running its script. How do I do that in Go? …well, I have zero access to the default scheduler, so I can't ask it for a list of running goroutines. I can't get a goroutine object and ask it whether it's running. I could give each scripted object a WaitGroup and a deferred call so that I can see when its goroutine exits for whatever reason, but that's slightly annoying and has to be repeated everywhere.

I need actions to happen on schedule. I can handle the whole MUD being 50ms slow for one scheduling tick (which is planned to be 250ms); I could handle one script being slightly late for one tick; but I can’t handle long-term clock drift. Also, the order of execution might be dictated by game rules — players always go first, ordered by their speed stat; NPCs go second, ordered similarly; items and rooms go last in arbitrary order. This is much easier to handle if I write my own scheduler.

I need to suspend tasks. I wrote a script for an NPC sailor to wander around, singing and quaffing ale, but halfway through, a player attacks her. I need to be able to suspend this singing and quaffing task to handle combat. In Go’s world, I need to check whether the NPC is currently in combat after every yield. This is unusable.

What language does this stuff right? Well, the best I’ve seen is D. Fibers in D are much more of an afterthought than in Go, yet in D I can write my own scheduler, get a reference to a Fiber object to check on its status, and even cancel further execution of a fiber all in the standard library.

What if you’re stuck with Java? Well, most of the time, you aren’t manipulating shared state anyway. You need to ensure that your database library is threadsafe or just instantiate a new adapter with each task, and probably similarly with a couple other things, but you can pretty much ignore the fact that things are running in multiple threads and be okay 95% of the time. Just use a threaded ExecutorService and be done.

Type system

I thought the type system was a bit anemic before. Now I view it as an enemy.

Interfaces are not met by accident. They are planned. People write code to match an interface. Go doesn’t realize that. There is no way to say to the language: here’s this type, and oh by the way, just ensure for me that it matches the io.Reader interface. So you get compilation errors at call sites because the type doesn’t match the interface you designed it to match. This is the opposite of what I want.

There is no virtual dispatch on concrete types. People do not use interfaces by default; they use concrete types by default. This means testing is ugly. I end up having to write interfaces for other people's code a lot.

Covariant return types are not allowed. Interfaces operate on exact match only. I wrote an interface for a Redis client for testing. Then I realized that I couldn’t instantiate the return type for one of the methods with sensible values — it had private fields with public getters. So I had to write a wrapper struct for the Redis client that simply forwarded the relevant method but had a slightly different return type. (It actually isn’t possible, given the language, to solve this problem in a sensible manner. That doesn’t mean it’s less painful or that Rob Pike is any less at fault; it just means he messed up earlier and it only became apparent here.)

Syntax and parsing

Go's syntax looks a bit funky at first. And then, eventually, it hits you: this language was not designed to make it easy for you to read and write it. It isn't designed to make it fast for you to understand what's written. It's instead designed to reduce the amount of lookahead the compiler has to do and to simplify the work of parsing.

Why do I have to type “type Foo struct” rather than just “Foo struct”? The latter is consistent with the little-endian nature of Go, where types follow variables. But if you had “Foo struct” and “bar func”, that would increase the amount of lookahead the compiler had to do. Similarly, with functions, Go could have followed a strategy similar to the C family of syntaxes. But that would require more lookahead to implement.

It’s certainly not to help me read things faster. Remove the “func” and “type” keywords and I can read code just as fast. I can write it slightly faster. It’s only for the compiler’s benefit that I have to write these keywords.

This is backwards. This is perverse. A team of five to ten individuals decided they wanted to do slightly less work, so everyone else has to do more. Normally we pay people big money to spend more effort so that a lot of other people can do slightly less, and we think that's valuable, a good tradeoff. But here we get the exact opposite, and people seem to love it. I don't understand.

There are other problems I have with the syntax. The compiler requires a := to create and initialize a new variable, while it just uses = for an assignment to an existing variable. There’s a special variable, _, which indicates “throw this value away”.

However, there are two other variable names that you will reuse very, very often: err and ok. err is the default variable name (by the documentation, not by language features) for an error. Many things return errors in addition to something else, and most of the time you’ll write something like val, err := tryGetValue(). It would be awesome if you could use := when reusing at least the ‘err’ variable.

I’m thinking of pre-declaring at least err so I can always use = for it, but I don’t think that would save me thanks to multiple return values.

All in all, this looks like two features that seem great in isolation (different syntax for declaring with initialization versus assignment, added to multiple return values) not working together very well in practice. But I’ve never even seen anyone use multiple return values aside from returning an error with a single value, so…

Also, Go says that all loops are special cases of for loops. You create an infinite loop with for { doStuff() }. You create a while loop with for booleanExpression { doStuff() }. This hides programmer intent. Not ideal.

Constant initialization with iota is magic. You can write:


const (
  B = 1 << (iota * 10)
  KB
  MB
  GB
  TB
)

This gives you the constants you would expect given the names. The thing to remember is that iota increments on each line of the const block, and a constant without an initializer acts as if you had copied and pasted the previous constant's initializer… The first time I read this sort of code, I had no clue what it meant. (Also, it started with _ = iota, which confused things slightly more.) I thought I'd get sequential values, like every other language, incrementing from the previous given value. Or, if the language were especially clever, equal increments.

Magic is only good for making a programmer feel clever.

(As an aside, I praised D for its concurrency earlier. The standard library contains a lot of code written by someone who likes feeling clever. This means you have to write things like dur!"msecs"(15) rather than a more sensible construct like Duration.fromMillis(15). Even though I don’t have to modify that code, I depend on it, so I have to spend effort to understand an API expressed in templates and metaprogramming rather than simpler constructs.)

Shadowing declarations

We spoke a few moments ago about the problems with multiple assignments. Here’s a kicker: every time you create a new scope (which is, roughly, every time you have a new set of curly braces), you can freely shadow declarations.

What does that mean? Well, let’s take this snippet dealing with Redis:


var cursor int64 = 0
for {
  cursor, keys, err := redis.Scan(cursor, "prefix:*", 10)
  if cursor == 0 {
    break
  }
  // ...
}
if cursor != 0 {
  // We stopped early. Do something special.
}

Redis’s SCAN call takes an input cursor indicating where to start and emits an output cursor indicating where to start next time. Simple, right? Obviously correct. Wrong.

You created a new variable named keys. But you can't separate that one variable's new-ness from the others: inside a new scope, := declares fresh versions of every name on the left. So instead of doing the right thing and updating cursor each time through the loop, you get a brand new variable.

What actually happens is that the call reads the outer cursor (still 0), the result lands in a brand-new inner cursor that dies at the end of each iteration, and the outer cursor never updates. You'll process the first set of values over and over again forever, and the check after the loop never sees where you actually stopped.

Conclusions

People mock Javascript for its awfulness, but Go isn’t far behind. Use Go instead of Node.js if you want, but since there are bajillions more Javascript devs than Go devs, you’d be better off using Node.js. Dart’s even significantly more popular than Go, so if you want static typing, that’s an option. (Or you can use TypeScript with Node.js, but you still have to deal with a lot of JS’s oddities.)