Getting Ready for DWeb Camp: A Conversation with Kelsey Breseman

Earlier this week, I spoke with Kelsey Breseman, a rockstar engineer and entrepreneur working to solve climate change, protect public access to scientific data, and build a better web. Equal parts concrete problem solver and utopian dreamer, she spends her spare time wandering the forests north of Seattle and reveling in VERY long walks.

In July, she will be leading a workshop at the Decentralized Web Camp. Here’s our conversation, edited for length and clarity. 

Kelsey, fantastic to meet another climate nerd working on the Decentralized Web. Thanks for speaking with me!

Let’s start with climate. You’re currently working on a book to introduce engineers, entrepreneurs and other change-makers to the subject, as well as working at an environmentally-focused nonprofit. When did you realize this was what you needed to work on? 

I had a classic tech startup after college. It was a very ‘S.F. Bay Area young engineer’ feeling. I was pulled into a job, it could have been a career, I was making money — but I wasn’t satisfied.

Part was the hours. It wasn’t physically good for my body. Part was that I was doing user research, and the demographics were not the demographics I was interested in serving. Maybe, I thought, the tool we were making wasn’t that transformative. It was not a bad tool or bad community, but I wanted to spend the majority of my time on something that really matters.

I did the thing where I quit without a plan. 

I wanted to find something to work on that would have an impact, and something I might be good at. Climate change was the obvious direction — it’s really big and it needs a lot of different initiatives, including engineering. It needs different people and different solutions, all acted upon at once. Climate change is a set of enormous shifts that will happen globally. Entrepreneurship thrives where things are changing.

I had a few different ventures in the climate entrepreneurship space, but though this was values-aligned, I kept hitting the same issues as before in terms of physically wearing myself down. So I was really pleased to stumble across a listing at EDGI, the Environmental Data and Governance Initiative, offering remote, part-time work for someone experienced in engineering, open source, project management, and taking initiative. The non-profit has much better leverage and contacts than I could build on my own in terms of impacting climate policy, and I help the org design and execute on projects at the intersection of environment, technology, governance, and justice. And I’m finally able to balance my time to do everything else: keep bees, bake sourdough, find intense physical adventures, and volunteer with activist movements.

How does it all intersect with the Decentralized Web? How did you get connected to this big, ambitious project?

I found out about DWeb a few years ago, but my biggest involvement is through EDGI. EDGI started around concerns that the United States government could decrease access to public environmental data, especially data that it produces, particularly in politically motivated ways. As a starting point, the group of academics, volunteers, and otherwise ordinary citizens who became EDGI coalesced around the mission to ensure that, just in case motives got wonky, as much of that data as possible was archived somewhere. Then the next step was to think about how to increase access to that data.

That next step centers on the question: “What does it look like to have an unbiased approach to data ownership?”

One of the most interesting efforts at EDGI is the Data Together project. We’re interested in the Decentralized Web — in how people own the data they need and use. We bring together people building DWeb protocols that enable data storage, and we ask: what does it mean to create virtual citizenship in that space? That’s what is bringing me to camp.

So you think a lot about what being a “good citizen” means on the web. What does that look like?

Being a good citizen — we as Americans see our civic duty as voting and taxes, and being “productive” in the sense of having a job, and that’s all. That hasn’t always been the case; a truly committed citizenship is more. 

Data Together is largely hyper-educated tech people. Together, we discuss ways to design tech to be good stewards of data. This is informed by EDGI’s broader work, including a formulation of Environmental Data Justice, and also informs EDGI’s work, especially in archiving.

I don’t think we’ve come up with conclusions, but the act of talking to each other about ethics and values matters. Centering conversations around what we are trying to do as ‘citizens for good’ is important and massively useful.

We look, for example, at case studies where people thought they were being unbiased and fell short. For example, Bitcoin was designed to be just technology. No policy, no society, etc. But because voting power was tied to mining capacity, which was concentrated among those with capital, it couldn’t actually get away from power structures.

DWeb Camp, to me, is a place where we can practice this active, creative kind of citizenship. The radical act of gathering together in nature, setting up our own infrastructure for a week, and asking these big questions of each other. 

I’ve always had a weakness for utopian societies. That feeling that we might be creating something fundamentally new, and deciding what the rules will be.

Gatherings that bring folks together in physical space foster connection. The meaningful casual interactions — sharing food, seeing who wants to stay out late and look at the sunset, making space to be human together — create a motivation to work together on the technology.

DWeb Camp is rooted in the idea of intentional community. How might we engage with data, with money, with people? How do we do that in a way that creates a different world? I’ve been describing DWeb Camp as Burning Man for nerds. I don’t know what to expect — and anything you can be excited about without knowing what to expect is cool. 

Exactly! And right now, DWeb Camp is full potential, energized by a remarkable set of thinkers and engineers who are bringing it to life. 

I’ve spent a lot of time volunteering on open source projects. The technology may be what draws you in, but it is the community that keeps you involved. You show up on a call because you want to see the people and share in their work. As Liz Barry said on a recent Data Together call, our polity is the set of people with whom we can share dreams. 

This is a gathering of those dreamers. 

So thinking about what will happen at camp when everyone is gathered together, you’re offering a workshop that you’re calling a “Technical Salon.” What’s the plan and why should people attend? 

I have wanted to give this workshop for years. I’m really interested in communities and how to foster a sense of connection between strangers — so this is an experiment.

In the technical salon, you don’t start with your name, where you’re from, or where you work. Instead, you put three things you’re interested in talking about on your name-tag. You come into the space with, more or less, your heart on your sleeve to declare what you want to talk about.

My hope is that this will help people to connect more deeply, more vulnerably, right away, by meeting immediately over the things that matter to them. 

I’ll certainly be there, and am sure others will too. 

Thanks so much for sharing your work on climate, DWeb, citizenship and more. See you later in July!

If you would like to join Kelsey and other marvelous thinkers at DWeb Camp, learn more and sign up here. July 18-21 at a Farm near Pescadero, CA.

________________________

Kelsey Breseman is an engineer, entrepreneur, and community builder. She spends as much time as possible outside in the woods, thinking about and experimenting with different ways to save the world.

Most 20th Century Books Unavailable to Internet Users – We Can Fix That

The books of the 20th century are largely not online. They are mostly not available from even the biggest booksellers. And libraries that have collected hard copies of these books have not been able to deliver them in a cost-efficient, simple, digital form to their patrons.

The way libraries could fill that gap is to adopt and deliver a controlled digital lending service. The Internet Archive is trying to do its part but needs others to join in. 

The Internet Archive has worked with 500 libraries over the last 15 years to digitize 3.5 million books. But because of copyright concerns, the selection has often been restricted to pre-1923 books. We need complete libraries and comprehensive access to nurture a well-informed citizenry. The following graph shows the number of books digitized by the Internet Archive, binned by decade:

Up until 1923, the graph shows our collection increasing, mirroring the rise in publications. Then it dips and slows because of concerns and confusion about copyright protections for books published after that date. It picks up again in the 1990s because these books are more readily available and separate funding has helped us digitize some recent modern books. Nevertheless, the end result is that the gap is big: the digital world is missing a huge chunk of the 20th Century.

Users can’t even fill that gap by buying the books from that time period. According to a recent paper by Professor Rebecca Giblin, the commercial life of a book is typically exhausted 1.4 to 5 years from publication; some 90% of titles become unavailable in physical form within just two years. Most older books are therefore not available to be purchased in either physical or digital form. The following graph, pulled from a study by Professor Paul Heald, shows books by decade that are available on Amazon.com. It shows that the world’s largest bookseller has the same huge gap – the 20th century is simply missing. 

The 20th Century represents a significant portion of published knowledge – approximately one-third of all books – as shown in the graph below.  These books are largely unavailable commercially, BUT they are not completely lost. Many of these books are on library shelves, accessible only if you physically visit the library that owns those books. Even if you’re willing to visit, those books might still not be accessible. Libraries, pressed to repurpose their buildings, have increasingly moved volumes to off-site storage facilities.

The way to make 20th Century books available to library patrons is to digitize those books and let every library that owns a physical copy lend that book in digital form. This type of service has come to be known as controlled digital lending (CDL). The Internet Archive has been doing this for years. We lend out-of-copyright and in-copyright volumes that we physically own. We reformat the physical volume, produce a digital version, and lend only that digital version to one user at a time. Our experience shows that this responds to a real demand, fills a genuine need satisfactorily, gives new life to older books, and brings important knowledge to a new audience. Check out this case study for CDL involving the book Wasted, which figured prominently in the Brett Kavanaugh Supreme Court nomination hearings.
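
The mechanics reduce to an “owned-to-loaned” ratio: at any moment, the number of digital copies on loan may not exceed the number of physical copies the library has taken out of circulation. A minimal sketch of that constraint in Python (hypothetical names, not the Internet Archive’s actual lending system):

    from dataclasses import dataclass, field

    @dataclass
    class CDLBook:
        """One digitized title backed by physical copies the library owns."""
        title: str
        owned_copies: int = 1                    # copies withdrawn from physical circulation
        loans: set = field(default_factory=set)  # patron ids currently holding the digital copy

        def borrow(self, patron_id: str) -> bool:
            # The core CDL rule: digital loans never exceed owned copies.
            if len(self.loans) >= self.owned_copies:
                return False  # all copies out; the patron can join a waitlist
            self.loans.add(patron_id)
            return True

        def give_back(self, patron_id: str) -> None:
            self.loans.discard(patron_id)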

Our experience has been replicated by other early adopters and providers of a CDL service. Here’s a list of some of them. We believe every library can transform itself into a digital library. If you own the physical book, you can choose to circulate a digital version instead.

We urge more libraries to join Open Libraries and lend digitized versions of their print collections, making more copies of books available for loan and getting more books into the hands of digital readers everywhere.

Internet Archive Responds to UK Online Harms White Paper

The United Kingdom has proposed a broad new regulatory framework for dealing with harmful content online in its Online Harms White Paper. The Internet Archive is concerned that the new framework could have problematic unintended consequences for digital libraries.

Below is our full response:

Introduction

The Internet Archive, a US-based 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, people with print-disabilities, and the general public. Our mission is to provide Universal Access to All Knowledge.

We appreciate the opportunity to weigh in on the important question of how to manage harmful content online. We believe the web has been an amazing boon to society by democratizing access to knowledge and culture, but we recognize some harms are very real. We therefore urge the government to proceed carefully with regulation.

Our response deals with two aspects of the UK government’s plans for regulating online harms: (1) the online services considered within the scope of the regulatory framework and (2) a suggested approach to accountability and transparency.

Nonprofit Libraries Should Not Be Within the Scope of the Regulatory Framework

Section 4 of the Online Harms White Paper describes the scope of the regulatory framework as applying to “companies that provide services or tools that allow, enable or facilitate users to share or discover user-generated content” including “non-profit organisations.” This scope is overly broad and would sweep in non-profit digital libraries and archives.

Historically, libraries and archives have not been regulated under the same rules as for-profit media organizations. For good reason: libraries have a fundamentally different role in society from commercial media companies. Libraries seek to fulfill a range of vital public interest goals: ensuring widespread access to knowledge, promoting literacy and learning, ensuring equity of access, and stewarding their communities’ cultural and literary heritage. Increasingly, knowledge and cultural heritage are created and shared online. In response, libraries are also moving online. This fact should not subject them to the same rules and burdens as for-profit media and social media companies.

Although libraries are moving online, their fundamental role in society remains the same. Libraries have always supported the individual’s right to be informed, to receive accurate and truthful information, as well as to seek, receive and impart ideas of all kinds–including dangerous or unpopular ones. Libraries also support literacy and help individuals learn to assess the veracity of information in front of them. In our current digital information ecosystem, filled with deception and misinformation, libraries play an important role in empowering an informed citizenry. A vague “duty of care” standard could stifle libraries from achieving their vital public service mission. For these reasons, we believe libraries and archives should be clearly excepted from the regulatory framework set forth in the White Paper.

The UK Government Should Support Transparency and Accountability via the Creation of a Restricted Access Archive of Removed Content

While our mission is Universal Access to All Knowledge, we recognize that some kinds of information can be so dangerous as to warrant being restricted to a limited set of people.

Colloquially, libraries, archives, and museums use the term “giftschrank,” meaning “poison cabinet,” to refer to an area where sensitive or potentially harmful materials are stored. This can take the form of a secret reading room that is off-limits to the general public, where only those with special scholarly permission are allowed access.

A “giftschrank” for collecting the materials that have been removed from company websites, either by reason of a legal removal request, or because the material violated the company’s own rules, could be another role for libraries to serve in the digital information ecosystem. While these materials may be harmful or dangerous to the general public, it remains vitally important for us as a society to nevertheless be able to study them. It is also important to have transparency into what kinds of materials are being removed, and what impact such removal may have on different communities. A giftschrank could help, and the Internet Archive is in a strong position to be a host institution for such an archive.

We therefore suggest that the government support the creation of a giftschrank of harmful materials removed from the internet. Some obstacles to building this include fear of potential liability for hosting the material. The government could help by limiting liability for good faith efforts. Another barrier is uncertainty around what materials should be included and who should have access. The government could help by convening a discussion with the appropriate stakeholders. Finally, funding would be necessary. The government could help either by directly providing the funds or by providing other financial incentives.

Getting Ready for DWeb Camp: Defining Our Terms

This is a guest post by Lawrence Wilkinson and Richard Whitt summarizing a conversation they led at the 2018 Decentralized Web Summit on the topic of the language and terminology we use to talk about the Decentralized Web.

“What do we mean exactly by X?” 

As the conversation around the Decentralized Web has evolved, one persistent line of questions has been around the very definition of the language we commonly use.

Over the course of the Builder’s Day, and the two days of the Decentralized Web Summit 2018, a group came together to parse this shared language. The terms “decentralized,” “federated,” and, perhaps especially, “open,” amongst others, were proffered for conversation, and the participants provided some excellent input. 

The goal was not to codify or limit ourselves. One discussant made the excellent point that some ambiguity is not a negative thing at this relatively early stage of development. It was labelled a “Goldilocks problem” — too much ambiguity, and there is a failure to communicate and difficulty cooperating; too little ambiguity, and there is little room for experimentation and happy accidents. The point, shared by others, is that we should want just enough of a shared understanding to allow us to move forward together, but still retain enough ambiguity to allow for innovation.

The Right Questions

By and large, there was agreement that we cannot all agree on what various terms mean with great precision. For example, for some the concept of being “decentralized” was the endgame, a goal unto itself. The consensus seemed to be, however, that “decentralization” refers variously to resources, or to functions, or to governance, and that each of these connotations in turn was a means to a larger objective. By contrast, the term “openness” generally was seen as the end goal — the why — while decentralized systems and protocols are the tools, the means — the how.

When the several groups worked through the nuances, there appeared to be some agreement on the following set of categories, based upon the specific line of questions being asked: in essence, what role is the particular term or concept intended to serve?

What

  • Web, Internet, Network, Protocol, Application, Blockchain, Server, End Point, Holographic Storage

How

  • Decentralized, Distributed, Federated, Interoperable, Self-steering, Generative tensions, Immune system response, Communication

Who

  • Users, Communities, Governance, Community-Governed, Builders, Founders, Ecosystem

Why/To What Effect/To What End

  • Privacy, Security, Open, Agency, Trust, Sustainable, Scalable, Global Consensus, Alignment, Provenance, Permissionless, Same opportunity for everyone, Tyranny, Manipulation, Stalked, Facts, Competitive with Centralized Systems, Who Loses, Self-Sovereign Data, User-Controlled, Effects on/For Users, Risks of Current Situation, Unrealized Benefits

“Constructive Conflict” as a Feature, Not a Bug 

Our glossary of terms, once our group established roughly which question a given term sought to answer, took a fascinating turn from trying to precisely define each term to doubling down on our shared sense that, at present, definitional confusion or fuzziness was less a bug and more a feature. 

As the Decentralized Web endeavor matures into a field, the terminology will settle into firmer and more widely-accepted definitions. But in the meantime, such perspective creates a “space” in which different understandings of a term can generate constructive conflict and lead to surprising advances.

Ambiguity, in this instance, is the friend of creativity, encouraging questioning, learning, and innovation. And at this stage in the development of the DWeb, creation is the main event.

EDITOR’S NOTE: We will continue to wrestle with these issues at DWeb Camp, July 18-21, 2019. We hope you will join us.

Lawrence Wilkinson

Lawrence Wilkinson is Chairman of Heminge & Condell (H&C), an investment and strategic advisory firm. Through H&C, Lawrence is involved in venture formation work, and as a director and counselor to a number of companies that he helped create over the years, among them: Wired, Oxygen Media, Broderbund Software, Ealing Studios, Colossal Pictures/USFX, Design Within Reach, and Public Bikes. As co-founder and president of Global Business Network (GBN), Lawrence helped develop and spread the scenario planning approach to long-term planning, now one of the most widely-used techniques by organizations globally; he continues to offer strategic counsel to a number of corporate clients, NGOs, and governments around the world. His recent work includes leading strategy projects for The Internet Archive, Mozilla, Wikimedia, EFF, Code for America, KQED, and Public Radio (NPR/PRI/CPB), all focused on the Future of Civil Discourse. He serves as Chair of The Institute for the Future (IFTF), Vice-Chair of Common Sense Media (which he co-founded), and a director of Landesa, Public Radio International, Public Architecture, and The Global Lives Project; as an advisor to The Library of the Future Project at The Bodleian Library, Oxford; as a Visitor at Harvard University Libraries; and as a Fellow of the MIT Center for Transportation and Logistics. He is a graduate of Davidson College, Oxford University, and Harvard Business School.

Richard Whitt

Richard Whitt is an experienced corporate strategist and technology policy attorney. Currently he serves as Fellow in Residence with the Mozilla Foundation, and Senior Fellow with the Georgetown Institute for Technology Law and Policy. As head of NetsEdge LLC, he advises companies on the complex governance challenges at the intersection of market, technology, and policy systems. He is also president of the GLIA Foundation, and founder of the GLIAnet Project.

Richard is an eleven-year veteran of Google (2007-2018). Most recently he served as Google’s corporate director for strategic initiatives, working with Vint Cerf, Hal Varian, and other Googlers on policy and ethical issues related to Internet of Things, machine learning, broadband connectivity, digital preservation, and other emerging technologies. A notable achievement was negotiating successfully with the Cuban government for permission to build the country’s first free public WiFi hotspot for Internet access. From 2012 to 2014, Richard was chosen by Google management as the Corporate Vice President and Global Head of Public Policy at newly-acquired Motorola Mobility.

Prior to his executive role with Motorola, Richard served as Google’s director and managing counsel for federal policy, overseeing strategic thinking on privacy, cybersecurity, intellectual property, Internet governance, and free expression. Previously he led the Company’s substantive advocacy on issues such as network neutrality, broadband deployment, “unregulation” of Internet applications, and spectrum policy. In particular he headed up Google’s open Internet policy on a global basis, guided the Company’s participation in the FCC’s 700 MHz auction, helped secure TV White Spaces spectrum allocation, and collaborated on the nationwide launch of Google Fiber.

Before joining Google in 2007, Richard spent twelve years in the legal department at MCI Communications. He most recently headed up MCI’s DC office as vice president for federal law and policy. 

Two Thin Strands of Glass

There’s a tiny strand of glass inside that thick plastic coat.

Two thin strands of glass. When combined, these two strands of glass are so thin they still wouldn’t fill a drinking straw. That’s known in tech circles as a “fiber pair,” and these two thin strands of glass carry all the information of the world’s leading archive in and out of our data centers. When you think about it, it sounds kind of crazy that it works at all, but it does. Every day. Reliably.

Except this past Monday night, here in California…

On Monday, June 24, the real world had other ideas. As a result, the Internet Archive was down for nearly 15 hours. For Californians, this was less of a big deal: those hours stretched from mid-Monday evening (9:11pm on the US West coast) to 11:51am on Tuesday, and many Californians were asleep during several hours of that time. But in the Central European time zone (e.g. France, Germany, Italy, Poland, Tunisia), that fell on early Tuesday morning (06:11) to mid-Tuesday evening (20:51). And in the entire country of India, it was late Tuesday morning (09:41) to just after midnight on Wednesday (00:21).
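
Those local times are easy to verify; with Python 3.9+, the standard-library zoneinfo module does the conversion (a quick sketch, using Paris and Kolkata as representative cities for the two zones mentioned above):

    from datetime import datetime
    from zoneinfo import ZoneInfo  # standard library in Python 3.9+

    pacific = ZoneInfo("America/Los_Angeles")
    start = datetime(2019, 6, 24, 21, 11, tzinfo=pacific)  # 9:11pm Monday, US West coast
    end = datetime(2019, 6, 25, 11, 51, tzinfo=pacific)    # 11:51am Tuesday

    print(end - start)  # 14:40:00 -- nearly 15 hours
    for tz in ("Europe/Paris", "Asia/Kolkata"):
        zone = ZoneInfo(tz)
        print(tz, start.astimezone(zone), "->", end.astimezone(zone))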


A Deep Dive into Openness

Laying a Shared Foundation for a Decentralized Web

In this guest post, Richard Whitt builds on his prepared remarks from the “Defining Our Terms” conversations at the Decentralized Web Summit in 2018. The remarks have been modified to account for some excellent contemporaneous feedback from the session.

Some Widespread Confusion on Openness 

So, we all claim to love the concept of openness — from our economic markets, to our political systems, to our trade policies, even to our “mindedness.” And yet, we don’t appear to have a consistent, well-grounded way of defining what it means to be open, and why it is a good thing in any particular situation. This is especially the case in the tech sector, where we talk at great length about the virtues of open standards, open source, open APIs, open data, open AI, open science, open government, and of course open Internet.

Even over the past few months, there have been obvious signs of this rampant and growing confusion in the tech world.

Recently, former FCC Chairman and current cable lobbyist Michael Powell gave a television interview, in which he decried what he calls “the mythology… of Nirvana-like openness.” He claims this mythology includes the notion that information always wants to be free, and openness is always good. Interestingly, he also claims that the Internet is moving towards what he called “narrow tunnels,” meaning only a relative few entities control the ways that users interact with the Web.

Here are but three recent examples of tech openness under scrutiny:

  • Facebook and the data mining practices of Cambridge Analytica: was the company too open with its data sharing practices and its APIs?
  • Google and the $5B fine on Android in the EU: was the company not open enough with its open source OS?
  • Network neutrality at the FCC in its rise and fall, and rise and now again fall. Proponents of network neutrality claim they are seeking to save an open Internet; opponents of network neutrality insist they are seeking to restore an open Internet. Can both be right, or perhaps neither?

The concept of openness seems to share some commonalities with decentralization and federation, and with the related edge/core dichotomy. To be open, to be decentralized, to be on the edge, generally is believed to be good. To be closed, to be centralized, to be from the core, is bad. Or at least, that appears to be the natural bias of many of the folks at the Summit. 

Whether the decentralized Web, however we define it, is synonymous with openness, or merely related in some fashion, is an excellent question.

The Roots of Openness

First, at a very basic level, openness is a very ancient thing. In the Beginning, everything was outside. Or inside. Or both.

A foundational aspect of systems is the notion that they have boundaries. But that is only the beginning of the story. A boundary is merely a convenient demarcation between what is deemed the inner and what is deemed the outer. Determining where one system ends and another begins is not such a straightforward task.

It turns out that in nature, the boundaries between the inner and the outer are not nearly as firm and well-defined as many assume. Many systems display what are known as semi-permeable boundaries. Even a human being, in its physical body, in its mental and emotional “spaces,” can be viewed as having extensible selves, reaching far into the environment. And in turn, that environment reaches deep into the self. Technologies can be seen as one form of extension of the properties of the physical and mental self.

The world is made up of all types of systems, from simple to complex, natural to human-made. Most systems are not static, but constantly changing patterns of interactions. They exist to survive, and even to flourish, so long as they gain useful energy in their interactions with their environments.

“Homeostasis” is a term describing the tendency of a system to seek a stable state by adapting and tweaking and adjusting to its internal and external environments. There is no set path to achieving that stability, however — no happy medium, no golden mean, no end goal. In fact, the second law of thermodynamics tells us that systems constantly are holding off the universe’s relentless drive towards maximum entropy. Only in the outright death of the system do the inner and the outer conjoin once again.

Human beings too are systems, a matrix of cells organized into organs, organized into the systems of life. The most complex of these systems might be the neurological system, which in turn regulates our own openness to experience of the outside world. According to psychologists, openness to experience is one of the Big Five personality traits. This is considered a crucial element in each of us because it often cuts against the grain of our DNA and our upbringing. Surprise and newness can be perceived as a threat, based on how our brains are wired. After all, as someone once said, we are all descendants of nervous monkeys. Many of our more adventurous, braver, more open forebears probably died out along the way. Some of them, however, discovered and helped create the world we live in today. 

From a systems perspective, the trick is to discover and create the conditions that optimize how the particular complex system functions. Whether a marketplace, a political system, a community, the Internet — or an individual.

Second, it may be useful to include networks and platforms in our consideration of a systems approach to openness.

There is no firm consensus here. But for many, a network is a subset of a system, while a platform is a subset of a network. All share some common elements of complex adaptive systems, including emergence, tipping points, and the difficulty of controlling the resource, versus managing it.

The subsets also have their own attributes. So, networks show network effects, while platforms show platform economic effects.

From the tech business context, openness may well look different — and more or less beneficial — depending on which of these systems structures is in play, and where we place the boundaries. An open social network premised on acquiring data for selling advertising may not be situated the same as an open source mobile operating system ecosystem, or Internet traffic over an open broadband access network. The context of the underlying resource is all-important and, as such, changes the value (and even the meaning) of openness.

Third, as Michael Powell correctly calls out, this talk about being open or closed cannot come down to simple black and white dichotomies. In fact, using complex systems analysis, these two concepts amount to what is called a polarity. Neither pole is an absolute unto itself, but in fact exists, and can only be defined, in terms of its apparent opposite.

And this makes sense, right? After all, there is no such thing as a completely open system. At some point, such a system loses its very functional integrity, and eventually dissipates into nothingness. Nor is there such a thing as a completely closed system. At some point it becomes a sterile, desiccated wasteland, and simply dies from lack of oxygen.

So, what we tend to think of as the open/closed dichotomy is in fact a set of systems polarities which constitute a continuum. Perhaps the decentralized Web could be seen in a similar way, with some elements of centralization — like naming conventions — useful and even necessary for the proper functioning of the Web’s more decentralized components.

Fourth, the continuum between the more open and the more closed changes and shifts with time. Being open is a relative concept. It depends for meaning on what is going on around it. This means there is no such thing as a fixed point, or an ending equilibrium. That is one reason to prefer the term “openness” to “open,” as it suggests a moving property, an endless becoming, rather than a final resting place, a being. Again, more decentralized networks may have similar attributes of relative tradeoffs. This suggests as well that the benefits we see from a certain degree of openness are not fixed in stone.

Relatedly, a system is seen as open as compared to its less open counterpart. In this regard, the openness that is sought can be seen as reactive, a direct response to the more closed system it has been, or could be. Open source is so because it is not proprietary. Open APIs are so because they are not private. Could it be that openness actually is a reflexive, even reactionary concept? And can we explore its many aspects free from the constraints of the thing we don’t wish for it to be?

Fifth, openness as a concept seems not to be isolated, but spread all across the Internet, as well as all the various markets and technologies that underlie and overlay the Internet. Even if it is often poorly understood and sometimes misused, openness is still a pervasive subject.

Just on the code (software) and computational sides, the relevant domains include:

  • Open standards, such as the Internet Protocol
  • Open source, such as Android
  • Open APIs, such as Venmo
  • Open data, such as EU Open Data Portal
  • Open AI, such as RoboSumo

How we approach openness in each of these domains potentially has wide-ranging implications for the others. 

“Open” Source

One quick example is open source.

The Mozilla Foundation recently published a report acknowledging the obvious: there is no such thing as a single “open source” model. Instead, the report highlights no fewer than 10 different open source archetypes, from “B2B” to “Rocket Ship to Mars” to “Bathwater.” A variety of factors are at play in each of the ten proposed archetypes, including component coupling, development speed, governance structure, community standards, types of participants, and measurements of success.

Obviously, if the smart folks at Mozilla have concluded that open source can and does mean many different things, it must be true. And that same nuanced thinking probably is suitable as well for all the other openness domains.

So, in sum, openness is a systems polarity, a relational and contextual concept, a reflexive move, and a pervasive aspect of the technology world.

Possible Openness Taxonomies

Finally, here are a few proposed taxonomies that would be useful to explore further:

Means versus End

Is openness a tool (a means), or an outcome (an end)? Or both? And if both, when is it best employed in one way compared to another? There are different implications for what we want to accomplish.

The three Fs

Generally speaking, openness can refer to one of three things: a resource, a process, or an entity. 

  • The resource is the virtual or physical thing subject to being open.
  • The process is the chosen way for people to create and maintain openness, and which itself can be more or less open. 
  • The entity is the body of individuals responsible for the resource and the process. Again, the entity can be more or less open.

Perhaps a more alliterative way of putting this is that the resource is the function, the chosen process is the form, and the chosen entity is the forum.

For example, in terms of the Internet, the Internet Protocol and other design elements constitute the function, the RFC process is the form, and the IETF is the forum. Note that all these are relatively open, but obviously in different ways. Also note that a relatively closed form and forum can yield a more open function, or vice versa.

Form and forum need not follow function. But the end result is probably better if they do.

So, in all cases of openness, we should ask: What is the Form, what is the Forum, and what is the Function?

Scope of Openness

Openness can also be broken down by scope, or the different degrees of access provided.

This can run the gamut, from the bare minimum of awareness that a resource or process or entity even exists, to transparency about what it entails, then to accessing and utilizing the resource, having a reasonable ability to provide input into it, influencing its operation, controlling its powers, and ultimately owning it outright. One can see it as the steps involved in identifying and approaching a house, and eventually possessing the treasure buried inside, or even forging that treasure into something new.

Think about the Android OS, for example, and how its labelling as an open source platform does, or does not, comport with the full scope of openness, and perhaps how those degrees have shifted over time. Clearly it matches up to one of Mozilla’s ten open source archetypes — but what are the tradeoffs, who has made them and why, and what is the full range of implications for the ecosystem? That would be worth a conversation.

Interestingly, many of these degrees of openness seem to be rooted in traditional common carrier law and regulation, going back decades if not centuries.

  • Visibility and Transparency: the duty to convey information about practices and services
  • Access: the norms of interconnection and interoperability
  • Reasonable treatment: the expectation of fair, reasonable, and nondiscriminatory terms and conditions
  • Control: the essential facilities and common carriage classifications

In fact, in late July 2018, US Senator Mark Warner (D-VA) released a legislative proposal to regulate the tech platforms, with provisions that utilize many of these same concepts.

Openness as Safeguards Taxonomy

Finally, openness has been invoked over the years by policymakers, such as Congress and the FCC in the United States. Often it has been employed as a means of “opening up” a particular market, such as the local telecommunications network, to benefit another market sector, like the information services and OTT industries.

Over time, these market entry rules and safeguards have fallen into certain buckets. The debate over access to broadband networks is one interesting example:

  • definitional — basic/enhanced dichotomy
  • structural — Computer II structural separation
  • functional — Computer III modular interfaces
  • behavioral — network neutrality
  • informational — transparency

In each case, it would be useful if stakeholders engaged in a thorough analysis of the scope and tradeoffs of openness, as defined from the vantage points of the telecom network owners, the online services, and the ultimate end users.

The larger point, however, is that openness is a potentially robust topic that will influence the ways all of us think about the decentralized Web.

Richard S. Whitt is an experienced corporate strategist and technology policy attorney. Currently he serves as Fellow in Residence with the Mozilla Foundation, and Senior Fellow with the Georgetown Institute for Technology Law and Policy. As head of NetsEdge LLC, he advises companies on the complex governance challenges at the intersection of market, technology, and policy systems. He is also president of the GLIA Foundation, and founder of the GLIAnet Project.

Identity in the Decentralized Web

By Jim Nelson

In today’s world, why do platforms require so many accounts for a single person? (Courtesy of Jolocom)

In July of 2018, more than 1000 people gathered at the Decentralized Web Summit to share the latest decentralized protocols for the Web. Over three days, groups took deep dives into the “roadblock” issues we must surmount to reach scale, including identity. The following report by Jim Nelson explains what identity might look like in a decentralized world.

In B. Traven’s The Death Ship, American sailor Gerard Gales finds himself stranded in post-World War I Antwerp after his freighter departs without him.  He’s arrested for the crime of being unable to produce a passport, sailor’s card, or birth certificate—he possesses no identification at all.  Unsure how to process him, the police dump Gales on a train leaving the country. From there, Gales endures a Kafkaesque journey across Europe, escorted from one border to another by authorities who do not know what to do with a man lacking any identity.  “I was just a nobody,” Gales complains to the reader.

As The Death Ship demonstrates, the concept of verifiable identity is a cornerstone of modern life. Today we know well the process of signing in to shopping websites, checking email, doing some banking, or browsing our social network.  Without some notion of identity, these basic tasks would be impossible.

Courtesy of Jolocom

That’s why at the Decentralized Web Summit 2018, questions of identity were a central topic.  Unlike the current environment, in a decentralized web users control their personal data and make it available to third-parties on a need-to-know basis.  This is sometimes referred to as self-sovereign identity: the user, not web services, owns their personal information.

The idea is that web sites will verify you much as a bartender checks your ID before pouring a drink.  The bar doesn’t store a copy of your card and the bartender doesn’t look at your name or address; only your age is pertinent to receive service.  The next time you enter the bar the bartender once again asks for proof of age, which you may or may not relinquish. That’s the promise of self-sovereign identity.
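
A toy sketch of that need-to-know exchange, in Python: the verifier asks one yes/no question about the credential, and nothing else is revealed. (These names are hypothetical, and real systems use signed credentials with cryptographic selective-disclosure or zero-knowledge proofs rather than a plain function call.)

    from datetime import date

    credential = {  # held by the user, never uploaded wholesale
        "name": "Gerard Gales",
        "birth_date": date(1990, 3, 14),
        "address": "123 Harbor St, Antwerp",
    }

    def prove_over_21(cred: dict, today: date) -> bool:
        """Answer the bartender's one question; reveal nothing else."""
        b = cred["birth_date"]
        age = today.year - b.year - ((today.month, today.day) < (b.month, b.day))
        return age >= 21

    # The bar learns a single boolean, not the name or address.
    print(prove_over_21(credential, date.today()))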

At the Decentralized Web Summit, questions and solutions were bounced around in the hopes of solving this fundamental problem.  Developers spearheading the next web hashed out the criteria for decentralized identity, including:

  • secure: to prevent fraud, maintain privacy, and ensure trust between all parties
  • self-sovereign: individual ownership of private information
  • consent: fine-tuned control over what information third-parties are privy to
  • directed identity: manage multiple identities for different contexts (for example, your doctor can access certain aspects while your insurance company accesses others)
  • and, of course, decentralized: no central authority or governing body holds private keys or generates identifiers

One challenge with decentralized identity is that these criteria often compete, pulling in polar directions.

Courtesy of Jolocom

For example, while security seems like a no-brainer, with self-sovereign identity the end-user is in control (and not Facebook, Google, or Twitter).  It’s incumbent on them to secure their information. This raises questions of key management, data storage practices, and so on. Facebook, Google, and Twitter pay full-time engineers to do this job; handing that responsibility to end-users shifts the burden to someone who may not be so technically savvy.  The inconvenience of key management and such also creates more hurdles for widespread adoption of the decentralized web.

The good news is, there are many working proposals today attempting to solve the above problems.  One of the more promising is DID (Decentralized Identifier).

A DID is simply a URI, a familiar piece of text to most people nowadays.  Each DID references a record stored in a blockchain. DIDs are not tied to any particular blockchain, and so they’re interoperable with existing and future technologies.  DIDs are cryptographically secure as well.

DIDs require no central authority to produce or validate.  If you want a DID, you can generate one yourself, or as many as you want.  In fact, you should generate lots of them.  Each unique DID gives the user fine-grained control over what personal information is revealed when interacting with a myriad of services and people.
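
To make that concrete, here is a toy illustration of minting identifiers locally. The “did:example” method name and the encoding are hypothetical simplifications; real DID methods follow the encoding rules in the W3C specification.

    import base64
    import hashlib
    import os

    def new_did(method: str = "example") -> str:
        # Fresh random key material, generated locally -- no registry or
        # central authority is consulted, and collisions are vanishingly rare.
        seed = os.urandom(32)
        fingerprint = hashlib.sha256(seed).digest()
        suffix = base64.urlsafe_b64encode(fingerprint[:16]).decode().rstrip("=")
        return f"did:{method}:{suffix}"

    # Mint one identifier per relationship: the bar, the bank, the doctor...
    print(new_did())  # e.g. did:example:Urz0Fqo0PGXZ5w0T9yuFMg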

If you’re interested in learning more, I recommend reading Michiel Mulders’ article on DIDs, “the Internet’s ‘missing identity layer’.”  The DID working technical specification is being developed by the W3C.  And for those looking for code and community, check out the Decentralized Identity Foundation.

(While DIDs are promising, it is a nascent technology.  Other options are under development.  I’m using DIDs as an example of how decentralized identity might work.)

What does the future hold for self-sovereign identification?  From what I saw at the Decentralized Web Summit, I’m certain a solution will be found.

Prior to joining the Internet Archive, Jim Nelson was lead engineer and Executive Director of the Yorba Foundation, an open-source nonprofit. In the past he’s worked at XTree Company, Starlight Networks, and a whole lot of Silicon Valley startups you’ve probably never heard of. Jim also writes novels and short fiction. You can read more at j-nelson.net.

The Internet Archive’s 2019 Artists in Residency Exhibition

Still from Meeting Mr. Kid Pix (2019) by Jeffrey Alan Scudder and Matt Doyle

The Internet Archive’s 2019 Artists in Residency Exhibition

New works by Caleb Duarte, Whitney Lynn, and Jeffrey Alan Scudder

Exhibition: June 29 – August 17, 2019

Ever Gold [Projects]
1275 Minnesota Street
Suite 105
San Francisco, CA, 94107

Hours: Tuesday – Saturday, 12-5 pm and by appointment

Ever Gold [Projects] is pleased to present The Internet Archive’s 2019 Artists in Residency Exhibition, a show organized in collaboration with the Internet Archive as the culmination of the third year of this non-profit digital library’s visual arts residency program. This year’s exhibition features work by artists Caleb Duarte, Whitney Lynn, and Jeffrey Alan Scudder.

The Internet Archive visual arts residency, organized by Amir Saber Esfahani, is designed to connect emerging and mid-career artists with the millions of items in the Archive’s collections and to show what is possible when open access to information intersects with the arts. During this one-year residency, selected artists develop a body of work that responds to and utilizes the Archive’s collections in their own practice.

Building on the Internet Archive’s mission to preserve cultural heritage artifacts, artist Caleb Duarte’s project focuses on recording oral histories and preserving related objects. Duarte’s work is intentionally situated within networks peripheral to the mainstream art world in order to establish an intimate relationship with the greater public. His work is produced through situational engagement with active sites of social and cultural resistance and strives to extend the expressions of marginalized communities through a shared authorship.

During his residency at the Internet Archive, Duarte visited communities in temporary refugee camps that house thousands of displaced immigrants in Tijuana, Mexico. By recording oral histories and producing sculptural objects, participants exercised their ability to preserve their own histories, centered around the idea of home as memory; the objects come to represent such a place. Using the Internet Archive, Duarte was able to preserve these powerful stories of endurance and migration that otherwise might be subject to the ongoing processes of erasure. The preservation of these memories required transferring the objects and oral histories into a digital format, some of which are carefully and thoughtfully curated into the Internet Archive’s collections for the public to access. For the exhibition at Ever Gold [Projects], Caleb has created an architectural installation representing ideas of “human progress,” using the same materials from Home Depot that we use to construct our suburban homes: white walls, exposed wooden frames, and gated fences. These materials and the aesthetics of their construction form a direct visual link to the incarceration of immigrant children. This installation is juxtaposed with raw drawings on drywall and video documentation of sculptural performances and interviews created at the temporary refugee camps in Tijuana.

Artist Whitney Lynn’s project builds on previous work in which Lynn questions representations of the archetypal temptress or femme fatale. This type of character is the personification of a trap, a multifaceted idea that interests Lynn. Many of her recent projects are influenced by the potential of an object designed to confuse or mislead. For her residency at the Internet Archive, Lynn has turned her attention to the ultimate femme fatale—the mythological siren. Taking advantage of the Archive’s catalog of materials, Lynn tracks the nature of the siren’s depiction over time. From their literary appearance in Homer’s Odyssey (where they are never physically described), to ancient Greek bird-creatures (occasionally bearded and often featured on funerary objects), to their current conflation with mermaids, sirens have been an object for much projection. Around the turn of the century, topless mermaids began to appear in Odyssey-related academic paintings, but in the Odyssey not only are the Sirens never physically described, their lure is knowledge—they sing of the pain of war, claim that they know everything on earth, and say that whoever listens can “go on their way with greater knowledge.” In Homer’s iconic story, Odysseus’s men escape temptation and death because they stuff their ears with wax and remain blissfully ignorant, while Odysseus survives through bondage. The Internet Archive’s mission statement is to provide “universal access to all knowledge” and the myth of the siren is both a story about forbidden knowledge and an example of how images can reflect and reinforce systems of power. Lynn’s investigation of the siren brings up related questions about the lines between innocence and ignorance, and the intersections of knowledge, power, and privilege.

Programmer and digital painter Jeffrey Alan Scudder’s project centers around Kid Pix, an award-winning and influential painting app designed for children released in 1989 by Craig Hickman. The user interface of Kid Pix was revolutionary—it was designed to be intuitive (violating certain Apple guidelines to reduce dialog boxes and other unwieldy mechanics), offered unusual options for brushes and tools, and had a sense of style and humor that would prove hard to beat for competitor products. The original binaries of Kid Pix and related digital ephemera are in the collections of the Internet Archive. As part of his practice, Scudder writes his own digital drawing and painting software, and has always wanted to meet Hickman. As part of his residency with the Internet Archive, he visited Hickman at his home in Oregon. In a video directed by Matthew Doyle, Scudder and Hickman discuss software, art, and creativity. Hickman donated his collection of Kid Pix-related artifacts and ephemera to the Computer History Museum, and the exhibition will include a display of these materials alongside Scudder’s work. In addition to the video work and the selection of artifacts on view, Scudder will present a whiteboard drawing/diagram about his work with the Internet Archive.

During the exhibition, Jeffrey Alan Scudder will produce a new iteration of Radical Digital Painting, an ongoing performance project which often includes other artists. Radical Digital Painting is named after Radical Computer Music, a project by Danish artist Goodiepal, with whom Scudder has been touring in Europe over the last two years. In 2018 alone, Jeffrey gave more than 45 lecture-performances on digital painting and related topics in the United States and Europe. On July 20 at 5 pm, Radical Digital Painting presents THE BUG LOG, a project by Ingo Raschka featuring Julia Yerger and Jeffrey Alan Scudder.

Please contact info@evergoldprojects.com with any inquiries.

More about the artists:

Caleb Duarte (b. 1977, El Paso, Texas) lives and works in Fresno. Duarte is best known for creating temporary installations using construction-type frameworks such as beds of dirt, cement, and objects suggesting basic shelter. His installations within institutional settings become sites for performance as interpretations of his community collaborations. Recent exhibitions include Bay Area Now 8 at Yerba Buena Center for the Arts (San Francisco, 2018); Emory Douglas: Bold Visual Language at Los Angeles Contemporary Exhibitions (2018); A Decolonial Atlas: Strategies in Contemporary Art of the Americas at Vincent Price Art Museum (Monterey Park, CA, 2017); Zapantera Negra at Fresno State University (Fresno, CA, 2016); and COUNTERPUBLIC at the Luminary (St. Louis, MO, 2015).

Whitney Lynn (b. 1980, Williams Air Force Base) lives and works between San Francisco and Seattle. Lynn employs expanded forms of sculpture, performance, photography, and drawing in her project-based work. Mining cultural and political histories, she reframes familiar narratives to question dynamics of power. Lynn’s work has been included in exhibitions at the San Francisco Museum of Modern Art; Torrance Art Museum; Yerba Buena Center for the Arts (San Francisco); RedLine Contemporary Art Center (Denver); and Exit Art (New York). She has completed project residencies at the de Young Museum (San Francisco, 2017) and The Neon Museum (Las Vegas, 2016). She has created site-responsive public art for the San Diego International Airport, the San Francisco War Memorial Building, and the City of Reno City Hall Lobby. Lynn has taught at Stanford University, the San Francisco Art Institute, and UC Berkeley, and is currently an Assistant Professor in Interdisciplinary Visual Arts at the University of Washington.

Jeffrey Alan Scudder (b. 1989, Assonet, Massachusetts) lives and works between Maine and Massachusetts. Scudder spends his time programming and making pictures. He attended Ringling College of Art & Design (BFA, 2011) and Yale School of Art (MFA, 2013). He has taught at UCLA and Parsons School of Design at The New School, and worked at the design studio Linked by Air. Recent exhibitions include drawings at 650mAh (Hove, 2018); INTENTIONS BASED ON A FUTURE WHICH HAS ALREADY HAPPENED at Naming Gallery (Oakland, CA, 2018); Radical Digital Painting at Johannes Vogt Gallery (New York, 2018); Imaginary Screenshots at Whitcher Projects (Los Angeles, 2017); drawinghomework.net Presents at February Gallery (Austin, 2017); New Dawn at Neumeister Bar-Am (Berlin, 2017); and VIDEO MIXER at Yale School of Art (New Haven, 2015). In 2018 alone, Jeffrey gave over 45 lecture-performances on digital painting and related topics in the United States and Europe. Selected recent lecture-performance venues include Weber State University (Ogden, Utah, 2019); 650mAh (Hove, 2018); Chaos Communication Congress (Leipzig, 2018); the ZKM Museum (Karlsruhe, Germany, 2018); Estonian Academy of Arts (Tallinn, Estonia, 2018); Bauhaus University (Weimar, Germany, 2018); and Yale School of Art (New Haven, 2018).

About the Internet Archive:

At the Internet Archive, we believe passionately that access to knowledge is a fundamental human right. Founded by Brewster Kahle with the mission to provide “Universal Access to All Knowledge,” this digital library serves as a conduit for trusted information, connecting learners with the published works of humankind. Like the internet itself, the Internet Archive is a critical part of the infrastructure delivering the power of ideas to knowledge seekers and providers. Over 23 years, we have preserved more than 45 petabytes of data, including 330 billion web pages, 3.5 million digital books, and millions of audio, video and software items, making them openly accessible to all while respecting our patrons’ privacy. Each day, more than one million visitors use or contribute to the Archive, making it one of the world’s top 300 sites. As a digital library, we seek to transform learning and research by making the world’s scholarly data and information linked, accessible and preserved forever online.

Internet Archive Partners with University of Edinburgh to Provide Historical Web Data Supporting Machine Translation

The Internet Archive will provide portions of its web archive to the University of Edinburgh to support the School of Informatics’ work building open data and tools for advancing machine translation, especially for low-resource languages. Machine translation is the process of automatically converting text in one language to another.

The ParaCrawl project is mining translated text from the web in 29 languages. With over 1 million translated sentences available for several languages, ParaCrawl is often the largest open collection of translations for each language. The project is a collaboration between the University of Edinburgh, University of Alicante, Prompsit, TAUS, and Omniscien with funding from the EU’s Connecting Europe Facility. Internet Archive data is vastly expanding the data mined by ParaCrawl and therefore the number of translated sentences collected. Led by Kenneth Heafield of the University of Edinburgh, the overall project will yield open corpora and open-source tools for machine translation as well as the processing pipeline.
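
One of the simpler signals used in this kind of web bitext mining is URL structure: pages whose addresses differ only by a language code are likely translations of one another. A sketch of that heuristic in Python (a hypothetical helper; production pipelines such as Bitextor combine several alignment signals):

    import re
    from collections import defaultdict

    def pair_by_url(urls, lang_a="en", lang_b="is"):
        """Pair pages whose URLs differ only by a language code (/en/ vs /is/)."""
        pattern = re.compile(rf"/({lang_a}|{lang_b})(?=/|$)")
        buckets = defaultdict(dict)
        for url in urls:
            m = pattern.search(url)
            if m:
                key = pattern.sub("/{lang}", url)  # normalize the language out
                buckets[key][m.group(1)] = url
        return [(b[lang_a], b[lang_b]) for b in buckets.values()
                if lang_a in b and lang_b in b]

    print(pair_by_url([
        "https://example.com/en/about",
        "https://example.com/is/about",
        "https://example.com/en/news",  # no Icelandic counterpart -> unpaired
    ]))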

Archived web data from IA’s general web collections will be used in the project.  Because translations are particularly scarce for Icelandic, Croatian, Norwegian, and Irish, the IA will also use customized internal language classification tools to prioritize and extract data in these languages from archived websites in its collections.
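
What such a prioritization pass might look like: scan web-archive (WARC) files and keep pages detected as one of the four scarce languages. This sketch uses the open-source warcio reader and fastText’s public lid.176 language-ID model as stand-ins for the Archive’s customized internal tools; the file and function names are assumptions.

    import re

    import fasttext  # pip install fasttext
    from warcio.archiveiterator import ArchiveIterator  # pip install warcio

    # Public 176-language model:
    # https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin
    model = fasttext.load_model("lid.176.bin")
    SCARCE = {"is", "hr", "no", "ga"}  # Icelandic, Croatian, Norwegian, Irish

    def scarce_language_pages(warc_path):
        """Yield (url, language) for archived pages detected as a scarce language."""
        with open(warc_path, "rb") as stream:
            for record in ArchiveIterator(stream):
                if record.rec_type != "response":
                    continue
                html = record.content_stream().read().decode("utf-8", errors="replace")
                text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
                labels, _ = model.predict(text.replace("\n", " ")[:5000])
                lang = labels[0].replace("__label__", "")
                if lang in SCARCE:
                    yield record.rec_headers.get_header("WARC-Target-URI"), lang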

The partnership expands on IA’s ongoing effort to provide computational research services to large-scale data mining projects focusing on open-source technical developments for furthering the public good and open access to information and data. Other recent collaborations include providing web data for assessing the state of local online news nationwide, analyzing historical corporate industry classifications, and mapping online social communities. As well, IA is expanding its work in making available custom extractions and datasets from its 20+ years of historical web data. For further information on IA’s web and data services, contact webservices at archive dot org.

Please Donate 78rpm Records to the Internet Archive’s Great 78 Project

Good news: we have funding to preserve at least another 250,000 sides of 78rpm records, and we are looking for donations of records to digitize and physically preserve. We try to do a good job of digitizing and hosting the recordings, and then thousands of people listen, learn, and enjoy these fabulous recordings.

If you have 78s (or other recordings) that you would like to find a good home for, please think of us — we are a non-profit and your donations will be tax-deductible, digitized for all to hear, and physically preserved. If you are interested in donating recordings of any type or appropriate books, please start with this form and we will contact you immediately.

We are looking for anything we do not already have. (We are finding 80% duplication rates sometimes, so we are trying to find larger or more niche collections).  We will physically preserve all genres, but our current funding has directed us to prioritize digitization of non-classical and non-opera.

We can pay for packing and shipping, and are getting better at the logistics for collections of a few thousand and up.  These are fragile objects and we are having good luck avoiding damage.

Tina Argumedo Collection
Daniel McNeil
Boston Public Library

The collections get highlighted and if you submit a story we will post it prominently. For instance: Boston Public Library, Daniel McNeil and Tina Argumedo’s Argentinian Tango collection.

The reason to highlight the donors is twofold: one is to celebrate the donor and their story, and the other is to help contextualize these recordings for different generations. These stories help users find meaning in the materials and find things they want to listen to. This way we can lead new listeners to love this music as the original collectors have.

Working together we can broaden this collection to works from around the world and different cultural groups in each country.

If you are a private individual or an institution and have records to contribute, even if they are not 78s, please start with this simple form, or email info@archive.org, or call +1-415-561-6767 and we will contact you immediately. Thank you.