What is Data Infrastructure?

In place of hot air about the definition of infrastructure, some thoughts about data infrastructure.

Data infrastructure is something like a road, or maybe a bridge. It’s a set of standards that allows data systems and consumers of data to operate without too much effort. DNS, the Domain Name System, is infrastructure: it translates readable names into numeric addresses, adding a layer of abstraction so that addresses can change. With paved land and spans, you don’t have to load sacks on a burro to carry goods through an untamed land, up and down hills, fording rivers, gazing at the stars and bouncing from landmark to landmark to find your way. Data infrastructure should work the same way. The modern internet needs more data infrastructure.

In email, we have the problem of spam. To deal with it, many systems have been developed to filter messages, to block spammers, and so on. They aren’t perfect, but they generally work. But we haven’t generalized that infrastructure to other problems. Some things, like anti-virus scanning of mail, use some of the same systems, but each is generally its own thing. Elsewhere, ad blockers and browsers keep lists of malicious websites to warn users. Some data infrastructure exists here, but it could be more generalized.

In still other places, we see no real data infrastructure. Several times per year I read about journalists and others who are targets of harassment campaigns online. The social platforms lack the kind of filter technologies that email has. But those technologies could be generalized. They should be generalized.

Identity, the ability to create a digitally-signed identity and authenticate with websites, would be a great and welcome form of data infrastructure. It has risks if it lets government censors snuff out dissent, but that already happens too often. A correctly described identity system would allow for multiple identities or multiple expressions of an identity depending on where and how it’s used.
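
To make "multiple expressions of an identity" a little more concrete, here is a minimal sketch, not a real identity system. It assumes the user holds one private master secret and derives a separate, stable identifier for each site. Note that it uses a keyed hash (HMAC) rather than digital signatures, which a real system would also need; the names `new_master_secret` and `persona_id` are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_master_secret() -> bytes:
    """Generate a random 32-byte secret that only the user keeps."""
    return secrets.token_bytes(32)


def persona_id(master_secret: bytes, site: str, persona: str = "default") -> str:
    """Derive a stable identifier for one site and one persona.

    The same inputs always give the same identifier, but two sites cannot
    link their identifiers to each other without the master secret.
    """
    message = f"{site}\n{persona}".encode("utf-8")
    return hmac.new(master_secret, message, hashlib.sha256).hexdigest()
```

With something like this, a forum and a shop would each see a consistent visitor, and the user could keep a second persona on the same forum by passing a different `persona` label.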

The advertising industry already tries to create identity tokens, but users have limited control over them. Some laws get passed to try to give control, and now every website has to tell you about cookies, but you still don’t have control, you just have an extra piece of cyber garbage floating atop every website.

Infrastructure lowers friction. Building websites has a key barrier: user sign-up. The easier it becomes for a user to sign up, the less advantage incumbents have. That is paramount for competitiveness in many online spaces. This and other barriers are the sort that data infrastructure should break down, in the same way that the transcontinental railroad and other major infrastructure projects opened up lands for new cities and new economies.

Other commerce-related data infrastructure changes would be welcome. One is simple resource links that are platform-agnostic. They would let you link to a song without linking to a particular music service. Or link to a video game without pointing at a specific store. That kind of infrastructure helps to allow competition without forcing fans to show favorites or act as advertisers without their consent.
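
As a rough sketch of how such links might work, assume a made-up neutral format like `song:Artist/Title` and a resolver that sends each reader to whichever service they prefer. Everything here is hypothetical: the link format, the `resolve` function, and the placeholder `.example` domains, which are not real services.

```python
from urllib.parse import quote

# Hypothetical URL templates, keyed by kind of thing and then by service.
SERVICE_TEMPLATES = {
    "song": {
        "service-a": "https://music-a.example/search?q={query}",
        "service-b": "https://music-b.example/find/{query}",
    },
    "game": {
        "store-x": "https://store-x.example/search?term={query}",
    },
}


def resolve(link: str, preferences: dict[str, str]) -> str:
    """Turn a neutral link into a URL on the reader's chosen service."""
    kind, _, identifier = link.partition(":")
    if kind not in SERVICE_TEMPLATES or not identifier:
        raise ValueError(f"unsupported link: {link!r}")
    templates = SERVICE_TEMPLATES[kind]
    # With no stated preference, fall back to the first listed service.
    service = preferences.get(kind, next(iter(templates)))
    if service not in templates:
        raise ValueError(f"unknown service for {kind!r}: {service!r}")
    # Percent-encode the whole identifier, including any slashes.
    return templates[service].format(query=quote(identifier, safe=""))
```

The person sharing the link writes `song:Some Artist/Some Song` once; the choice of service lives in the reader's `preferences`, not in the link.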


As the internet matures, opportunities arise to define and build data infrastructure. It took humans thousands of years to figure out roads and city planning (and we still get it wrong sometimes), but as we settle into patterns of use and behavior, and as we continue to have more data capacities, we need to be looking at ways to generalize our tools into outright infrastructure that supports the smooth flow of activities of all sorts.

What are the internet’s rivers that we can send boats down? Where are its mountains, requiring us to seek passes or blast tunnels? What will be the critical pieces of infrastructure that let computers do more work, rather than pretend to be a fancy form of paper?

Re: FW: [JOKE] Thoughts About Memes

Why are memes so prevalent?

Back in the 1990s and early 2000s there was a phenomenon of older relatives sending quackery to people’s email. These were wide in variety, including Neiman Marcus cookie recipe spam (Snopes: 3 November 1999: “Is the Neiman Marcus Cookie Story True?”), captioned images, and jokes, and they were almost always a transcript of forwards from across the internet, lasting for years and years.

And lots of them were political, and they were corny, and why did said relative have your email address, anyway?

But history likes to trick us. It likes to take a thing and twist it around and spit it back at us. So the same dreck that clogged our inboxes was inexplicably made cool once everyone left email in favor of Facebook and other social media platforms. The meme was born.

I don’t know what it means. Surely others have noted this FWD-to-meme evolution and how the former was as uncool as could be and the latter is seen as a form of net-cred. My best guess is that the elders impersonated youngsters on various zines and boards and whatevers, disguising their forward spam as coming from fellow youths. Now we have politicians memeing it up on their Twitter accounts, and nobody is running away from the damned things as last-year or overdone.

What, just because they’re funny?! Laughing gas is funny, too, but you don’t see people sending laughing gas over the internet!


Memes have always existed. Once upon a time folks would clip memes from the funny pages or newspapers or magazines. But they were always on the kitschy end of the thing, not some everyday, always-on device that would overrun real discussions.

These days, serious posts have the replies jammed full of videos of people making reactive faces. Use your words, people! I always ask myself, are there really people who go through and watch all those videos, anyway? God knows.

Some people had Monty Python and the Holy Grail memorized. I’m sure such people still exist, with different source material. On the other hand, the Christians and Jews and Muslims have been line-and-versing their memes out for centuries.

It strikes me as odd that we have this kind of short-circuit in our brains that says if you can encapsulate some idea in this trendy way, it suddenly takes on some special character. Like an advertising jingle that gets stuck in your head.


There are various possibilities for the rise of memes. One is that it’s platform metrics that drive them. Engagement, the mere reply or acknowledgment of a piece of content, is seen as key. Memes are a cheap way to engage, and the platforms like that.

There are others that say in our hyperconnected world nobody has time to think. Busy Twitch chats are full of stamp spam because nobody could usefully converse at 1000 lines/second. On the other hand, someone’s got time to make all those fancy plates of food showing up on Instagram (or are they just output from a generative adversarial network?).

One other possibility is they are a sign of the singularity. That as culture sublimates into the digital realm, human interactions become more and more patterned upon how consciousness directly relates the world to itself, with very id-based reactions to everything, and therefore the expressivity of a networked world naturally devolves into visceral-first communications.

Who knows?

Growing the Net With or Without Neutrality

Not about network neutrality as much as the need to grow the Internet regardless.

I was going to write in favor of classifying telecoms as telecoms (i.e., making them common carriers). But I think it misses the point, which is: how do we as a society best ensure the natural and unfettered growth of the Internet?

I will mention the issue of Network Neutrality briefly. It is the concept that a telecom or Internet service provider (ISP) must provide equal “best-effort delivery” of all data going across its network. That it cannot and must not give special treatment based on private contracts with the endpoints.

But it appears that the argument ought be moot. Consider the total value of commerce taking place over the Internet. We ought to leverage that massive sum: a cursory search estimated $8 trillion USD globally in 2011. I am not particularly big on taxation, but a modest tax (or fee) from that could easily fund massive development of the network.
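
To put rough numbers on "modest," here is a back-of-the-envelope calculation. The $8 trillion figure is the cursory estimate above; the fee rates are purely illustrative assumptions, expressed in basis points (one basis point is 0.01%).

```python
COMMERCE_USD = 8_000_000_000_000  # rough 2011 global estimate cited above


def annual_fund_usd(fee_basis_points: int, commerce_usd: int = COMMERCE_USD) -> int:
    """Money raised per year by a fee of the given size (1 bp = 0.01%)."""
    return commerce_usd * fee_basis_points // 10_000


# Illustrative rates only; nothing here proposes a specific fee.
for bp in (1, 10, 50):
    billions = annual_fund_usd(bp) / 1_000_000_000
    print(f"{bp / 100:.2f}% fee -> ${billions:,.1f} billion per year")
```

Even a tenth of a percent of that sum would come to $8 billion a year.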

Content delivery is probably only a minute portion of the total revenue, with online sales and advertising being larger pieces. But regardless, the Internet is making a lot of money for a lot of industries, all of which can benefit from a faster, broader Internet.

Does Network Neutrality help or harm the case for that investment? Does the oligopoly of telecoms in the USA help or hinder? Consider the best analog to the Internet: roads.

We do differentiate some road traffic. We let police vehicles and emergency vehicles ignore certain signals under certain circumstances. We also have carpool lanes and lanes for energy-efficient vehicles, to promote conservation of the resource. We have weigh stations for commercial traffic on major roadways to pay for higher upkeep. We also have some toll roads.

But the average person (who belongs to the car class; obviously the current, dominant transportation system has its own glaring problems, which I am ignoring here) can still get in their vehicle and cross the country without much hassle. Certainly without their car’s manufacturer or their hotel or anyone paying for the privilege.

That core system should be sanctified through law or regulation. The core ability to engage the network is fast becoming a recognized natural right.

But what about those other caveats of roads? Do they have a place on the Internet?

I think they have a place, if and only if we do enshrine neutrality into the basic system, and if and only if they are clearly differentiated and regulated. Moreover, the speed limits of physical roads do not apply to networks.

It would be absurd to fix the common path at some speed, or even to build it into law or regulation that could be neglected. The common speed should be growing as long as technology allows. At some point we will undoubtedly have no more need for the analog’s extras. In 100 years’ time, could we still possibly need toll roads for network traffic? Doubtful.

No, the question we face is one of growth and economics, not of legacy. I believe if a Netflix or a Valve wants to pay to send their data to someone in excess of that person’s natural connection speed (the natural speed ought be regulated/protected from economically artificial tampering by ISPs), that’s perfectly reasonable so long as they have every right not to make that deal and still receive best-effort delivery.

In other words, if we can have enhancement-only partiality, strictly regulated, and possibly taxed, it may be acceptable.
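
The rule can be stated as a small sketch: a paid deal may add speed on top of the subscriber's protected natural speed, but a missing deal never subtracts from it. The `Flow` type, the `delivered_mbps` function, and the numbers are all hypothetical illustrations, not a description of how any ISP actually shapes traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    sender: str
    natural_mbps: float           # subscriber's protected best-effort speed
    paid_boost_mbps: float = 0.0  # extra speed the sender chose to buy, if any


def delivered_mbps(flow: Flow) -> float:
    """Speed a flow receives under enhancement-only partiality."""
    if flow.paid_boost_mbps < 0:
        raise ValueError("a deal may only add speed, never subtract it")
    # Senders who decline to pay still get the full natural speed.
    return flow.natural_mbps + flow.paid_boost_mbps
```

A sender with no deal gets exactly the natural speed; a sender who pays gets more. There is no code path that delivers less, which is the part regulation would have to guarantee.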

The main caveat is the lack of regulatory agencies that have the chutzpah to actually prosecute malfeasance. Without that, the whole thing is a wash.