AlephPost
Today, common wisdom is that news media is both biased and polarizing. Back in 2017, this view was controversial. Yet it was a conviction my co-founder and I deeply held, as a result of our time spent working in the news ourselves, at the Huffington Post.
Our solution to this problem was AlephPost, a news aggregator that dynamically clustered articles by topic, surfacing trends about coverage, sentiment, and viewpoint.
We aimed to help readers see beyond what their “filter bubble” was telling them, so that they could understand perspectives, fill gaps in their knowledge, and ultimately come to a more nuanced perspective—hopefully with no added effort on their part.
In addition to a news portal, we also developed a browser plugin that would inject links to other articles right beside any link shared in your feed. This allowed a savvy user to resist a rage-inducing title, either in favor of better reporting or at least with some additional context in mind.
Background
From 2014 to 2017, Maggie Xiong and I worked at the Huffington Post. She directed the data and machine learning organization, and as the principal engineer, I pitched, developed, and managed all of our projects. (For what it’s worth, I’ve never had such a productive and joyful time working with anyone else, ever.)
We entered the company believing that by helping the Huffington Post, we were making the world a better place. At the time, HuffPost was the dark horse, democratizing the news. Yet we entered just as the industry collapsed under the weight of clickbait, social media advertising, and the triumph of opinion over reporting. We helped these trends along, in fact, introducing data and ML into every aspect of the business.
By the time Donald Trump entered the 2016 race, most engineers we spoke to had already stopped reading our paper. It had become a screed. Thanks to peers in other orgs, such as the New York Times and the Washington Post, we knew it wasn’t just HuffPost either. The news often left you less informed than when you began.
By the time we left, most journalists had been replaced by activists, and we were all the worse for it.
But we believed that people deserve better. So we made AlephPost.
Goals
Our primary aim was to become readers' destination for the news. Rather than visit a particular outlet, why not pick from all of the options on the table?
Our secondary aim was to improve passive news consumption—as this is the way most users experience the news. The browser plugin was our first pass at this. But we hoped to get buy-in from schools, libraries, and other institutions, so that all public machines would have the plugin pre-installed.
Challenges
We very quickly addressed the primary technical challenges:
- Scrape and cluster the news.
- Provide great search.
- Surface sentiment, “factual” quotes, and “polemic” quotes.
- Surface geographic trends and political coverage.
- Offer a great social media experience.
However, our greater challenge was in getting people to use our platform.
Maggie and I did a lot of outreach, but we heard the same response again and again:
“Oh, this is great… for them.”
Our beta users truly believed that only the other side of the political spectrum was biased. That they had a clear picture of the world. That their team were the good guys. That, remarkably, 50% of the country was either evil or stupid and all of the evil or stupid people were in the other political party.
We applied for grants as well, with the aim of executing the project as a non-profit. However, we were rejected.
Lessons
Our primary failure was that our product required persuading people that they had a problem. Our secondary failure was that our solution wasn’t emotionally rewarding to the user. After all, the whole point was that a user should come away feeling that they had been wrong about something.
And yet, I suspect we primarily failed because we were ahead of our time. Years later, Ground News launched, and even today, it looks surprisingly like a feature-poor version of what we’d produced.
Tech Stack
Our tech stack skewed very heavily toward the backend. The vast majority of our work was in research and background jobs.
- Website: Server-rendered site built using Express, EJS, and PostCSS. Hosted on AWS EC2.
- Database: PostgreSQL and Elasticsearch.
- Machine Learning: Our various features were implemented using PyTorch, NLTK, and Scikit-Learn. Background jobs ran on AWS EC2.
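
To give a flavor of what "clustering articles by topic" can look like, here is a minimal sketch in plain Python. It is an illustration only, not the original AlephPost code: it assumes TF-IDF weighting over headlines and a greedy single-link grouping by cosine similarity, and the names `cluster_headlines`, `STOPWORDS`, and the `threshold` value are all hypothetical. A production system would more likely lean on a library such as Scikit-Learn and work on full article text.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the sketch.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "for", "is", "as", "at", "by", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency, so no term gets a zero weight.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def cluster_headlines(docs, threshold=0.2):
    """Greedy single-link clustering: each document joins the most similar
    existing cluster if that similarity reaches the threshold, otherwise it
    starts a new cluster. Returns a list of lists of document indices."""
    vectors = tfidf_vectors(docs)
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = max(cosine(vec, vectors[j]) for j in members)
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Made-up headlines, purely for illustration.
headlines = [
    "Senate passes budget bill after late-night vote",
    "Budget bill clears Senate in overnight vote",
    "Hurricane makes landfall on Florida coast",
    "Florida coast braces as hurricane makes landfall",
    "Tech company unveils new smartphone",
]

for members in cluster_headlines(headlines):
    print([headlines[i] for i in members])
```

With these made-up headlines, the two budget stories land in one group, the two hurricane stories in another, and the smartphone story stays on its own. Once articles are grouped this way, per-cluster signals such as sentiment or which outlets covered a story can be computed over each group.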