Lavanya Singh was eager to write lots of code after her freshman year of college, but she knew it was hard to find a place that would give her a chance. Then she landed a spot with the Google Summer of Code (GSoC) program working at the Internet Archive.
Paired with Mark Graham, director of the Wayback Machine, Singh was asked to create a systematic way to archive news sources from all around the world.
“Mark basically gave me that problem and said: ‘Go figure it out,’” she recalls, grateful for the challenge, the tight-knit community at the Internet Archive, and the mentorship provided throughout the project. “The Internet Archive really trusts their interns and gives you an opportunity to do huge scale technical projects that are going to be useful in the long run.”
The experience gave Singh skills and confidence that led to other internships and a job as a software engineer, following graduation this spring from Harvard University with a degree in computer science and philosophy.
For 17 years, GSoC has given more than 18,000 students from 112 countries the chance to learn about programming up close. Google selects students (called “contributors”) and matches them with organizations doing open-source projects. All told, the students have created 40 million lines of code since the program’s inception in 2005. It has helped launch careers, like Singh’s, and provided a pipeline of potential employees for the 746 organizations that have participated. Google recently posted the 2022 Google Summer of Code timeline for applicants to the paid positions, which last 12 weeks.
“It is truly a benefit and service to students. For some, it can be transformational,” said Singh’s mentor, Graham, of the Internet Archive. “But it also helps us. It’s a way to learn about new talent. And it’s a way for the Internet Archive to increase our visibility and demonstrate that we are part of this community of organizations.”
GSoC provides an infrastructure for matching promising programmers with projects that can otherwise be difficult to find, and it is especially relevant now that many people are working remotely, said Brenton Cheng, a senior engineer with the Internet Archive.
“It’s been an incredible way by which people all over the world can get opportunities to work with companies, creating openings that might not be available to them otherwise,” said Cheng, who has mentored several student contributors over the years.
Staff assign mini-projects designed to give students hands-on experience and a sense of accomplishment. Students are also included in team meetings and invited to give input and present their work, Cheng said.
Recent GSoC projects and contributors:
- Rakesh Chinta focused on building advanced features for the existing Chrome extension for the Wayback Machine (2017);
- Zhengyue Cheng created a “map” of the web via the Wayback Machine (2018);
- Salman Shah worked with the Open Library team to modernize and increase the coverage of its book catalog and improve website reliability (2018);
- Kanchan Joshi improved site navigation for Archive.org (2019);
- Giacomo Cignoni made a significant contribution with his BookReader Selection & Dark Mode project. He worked to make text selectable over the book page images of public domain works (2020);
- Tabish Shaikh helped improve the adoption of Open Library with his Adoption of BookLovers project – redesigning the Book Page and making it clearer what services were offered (2020);
- Nolan Windham worked on the Open Book Genome Project. It centered on the ability of computers to read a book on our behalf and extract metadata that can then be made publicly useful to the world. Through the process, nearly 10,000 new books were added to the lending system (2021);
- Xin Yue Chen focused on linking Wikipedia references to Internet Archive books (2021).
“We’re helping to train the next generation of developers,” Cheng said. “On the flip side, we really believe in our mission. Quite often, the people who work with the Google Summer of Code program continue to contribute with us as volunteers or sometimes even become employees.”
It’s a mutual win and an awesome program that has helped a lot of students find connections with companies, added Cheng. The program is a way for young people to show their initiative and is advertised as a way to “flip bits not burgers” in the summer.
“It’s a chance to contribute to a larger organization and maybe set themselves on a different prospective path to their future,” Cheng said.
Mek, who leads the Open Library team at the Internet Archive, said the four GSoC students he’s worked with have made substantial improvements through their projects.
“We were able to make progress in a variety of different areas that we may not otherwise have had the bandwidth to focus on,” said Mek.
Being involved in GSoC has dramatically increased the number of volunteers interested in participating in the Open Library ecosystem. It prompted the Internet Archive to streamline the volunteer page and create an intake form. There has also been an effort to organize and label projects for new volunteers.
The GSoC experience led the Internet Archive to structure its own internship and fellowship opportunities. And it has provided the organization with a means to find qualified staff.
Anish Kumar Sarangi, a student GSoC contributor in 2018, joined the Internet Archive as an employee in May 2020. During his summer experience, Sarangi worked on development of the Chrome extension, “Wayback Machine.” Today it is used by thousands of people to help them archive URLs, access archived content from broken links and perform other functions to help make the web more useful and reliable.
“I gained a lot of knowledge and experience. Everyone was very encouraging and supportive,” Sarangi said of the summer program. He now works from India in software development for the Internet Archive and has been a mentor with the program himself. His advice to others considering applying: “Please get involved in the community. You can get guidance and grow further in the organization.”