Certainly the biggest news of the past month has been a continuation of the trend towards regulating the biggest players in the tech industry. The US House of Representatives is considering five antitrust bills that would lead to major changes in the way the largest technology companies do business, and the Biden administration has appointed a new Chair of the Federal Trade Commission who will be inclined to use these regulations aggressively. Whether these bills pass in their current form, how they are challenged in court, and what changes they will lead to are open questions. (Late note: Antitrust cases against Facebook by the FTC and state governments based on current law were just thrown out of court.)
Aside from that, we see AI spreading into almost every area of computing; this list could easily have a single AI heading that subsumes programming, medicine, security, and everything else.
AI and Data
- A new algorithm allows autonomous vehicles to locate themselves using computer vision (i.e., without GPS) regardless of the season; it works even when the terrain is snow-covered.
- An AI-based wildfire detection system has been deployed in Sonoma County. It looks for smoke plumes, and can monitor many more cameras than a human.
- Researchers are investigating how racism and other forms of abuse enter AI models like GPT-3, and what can be done to prevent their appearance in the output. It’s essential for AI to “understand” racist content, but equally essential for it not to generate that content.
- Google has successfully used reinforcement learning to design the layout for the next-generation TPU chip. The layout process took 6 hours, and replaced weeks of human effort. This is an important breakthrough in the design of custom integrated circuits.
- Facebook has developed technology to identify the source from which deepfake images originate. “Fingerprints” (distortions in the image) make it possible to identify the model that generated the images, and possibly to track down the creators.
- Adaptive mood control is a technique that autonomous vehicles can use to detect passengers’ emotions and drive accordingly, making it easier for humans to trust the machine. We hope this doesn’t lead AVs to drive faster when the passenger is angry or frustrated.
- IBM has developed Uncertainty Quantification 360, a set of open source tools for quantifying the uncertainty in AI systems. Understanding uncertainty is a big step towards building trustworthy AI and getting beyond the idea that the computer is always right. Trust requires understanding uncertainty.
- Waymo’s autonomous trucks will begin carrying real cargo between Houston and Fort Worth, in a partnership with a major trucking company.
- GPT-2 can predict brain activity and comprehension in fMRI studies of patients listening to stories, possibly indicating that in some way its processes correlate with brain function.
- GPT-J is a language model with performance similar to GPT-3. The code and weights are open source.
- It appears possible to predict a person’s preferences directly by comparing their brain activity to the brain activity of others (essentially, brain-based collaborative filtering). A tool for advertising or for self-knowledge?
- Feature stores are tools to automate building pipelines to deliver data for ML applications in production. Tecton, which originated with Uber’s Michelangelo, is one of the early commercial products available.
- How does machine learning work with language? Everything You Ever Said doesn’t answer the question, but lets you play with an NLP engine by pasting in a text, then adding or subtracting concepts to see how the text is transformed. (Based on GloVe, a pre-GPT model.)
- The HateCheck dataset tests the ability of AI applications to detect hate speech correctly. Hate speech is a hard problem; being too strict causes systems to reject content that shouldn’t be classified as hate speech, while being too lax allows hate speech through.
- Twitter has built a data ethics group aimed at putting ethics into practice, in addition to research. Among others, the group includes Rumman Chowdhury and Kristian Lum.
- A study of the effect of noise on fairness in lending shows that insufficient (hence noisier) data is as big a problem as biased data. Poor people have less credit history, which means that their credit scores are often inaccurate. Correcting problems arising from noise is much more difficult than dealing with problems of bias.
- Andrew Ng’s newsletter, The Batch, reports on a survey of executives showing that most companies are not practicing “responsible AI,” and don’t even understand the issues. There is no consensus about the importance (or even the meaning) of “ethics” for AI.
- Using AI to screen resumes is a problem in itself, but AI doing the interview? That’s taking problematic to a new level. It can be argued that AI, when done properly, is less subject to bias than a human interviewer, but we suspect that AI interviewers present more problems than solutions.
- WebGPU is a proposal for a standard API that makes GPUs directly accessible to web pages for rendering and computation.
- An end to providing cookie consent for every site you visit? The proposed ADPC (advanced data protection control) standard will allow users to specify privacy preferences once.
- Using social media community guidelines as a political weapon: the Atajurt Kazakh Human Rights channel, which publishes testimonies from people imprisoned in China’s internment camps, has been taken down repeatedly as a result of coordinated campaigns.
- Microsoft is working on eliminating passwords! Other companies should take the hint. Microsoft is stressing biometrics (which have their own problems) and multi-factor authentication.
- Supply chain security is very problematic. Microsoft admits to an error in which they mistakenly signed a device driver that was actually a rootkit, causing security software to ignore it. The malware somehow slipped through Microsoft’s signing process.
- Markpainting is a technique for defeating attempts to create a fake image: it adds elements to a picture that aren’t visible, but that become visible when the image is modified (for example, to eliminate a watermark).
- Amazon Sidewalk lets Amazon devices connect to other open WiFi networks to extend their range and tap others’ internet connections. Sidewalk is a cool take on decentralized networking. It is also a Very Bad Idea.
- Authentication using gestures, hand shapes, and geometric deep learning? I’m not convinced, but this could be a viable alternative to passwords and crude biometrics. It would have to work for people of all skin colors, and that has consistently been a problem for vision-based products.
- According to Google, Rowhammer attacks are gaining momentum, and will certainly gain even more momentum as feature sizes in memory chips get smaller. Rowhammer attacks repeatedly access a single row in a memory chip, hoping to corrupt bits in adjacent rows.
- While details are sketchy, the FBI was able to recover the Bitcoin that Colonial Pipeline paid to DarkSide to restore systems after their ransomware attack. The FBI has been careful to say that they can’t promise to recover payments in other cases. Whether this recovery reflects poor opsec on the part of the criminals, or that Bitcoin is more easily de-anonymized than most people think, it’s clear that secrecy and privacy are relative.
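The “adding or subtracting concepts” idea behind Everything You Ever Said (in the list above) can be illustrated with word-vector arithmetic. What follows is a minimal sketch, not the site’s actual code: the three-dimensional vectors are invented for illustration (real GloVe vectors are learned from text and have many more dimensions), and the names `VECTORS`, `nearest`, and `shift_concept` are hypothetical.

```python
import math

# Toy "embeddings", invented for illustration only. Real GloVe vectors
# are learned from large text corpora and have far more dimensions.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(target, exclude=()):
    """Return the vocabulary word whose vector is most similar to target."""
    candidates = (w for w in VECTORS if w not in exclude)
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


def shift_concept(word, minus, plus):
    """Subtract one concept from a word, add another, and look up the result."""
    target = add(sub(VECTORS[word], VECTORS[minus]), VECTORS[plus])
    # Exclude the input words so the answer is a different word.
    return nearest(target, exclude={word, minus, plus})


print(shift_concept("king", "man", "woman"))
```

With these toy numbers, removing “man” from “king” and adding “woman” lands closest to “queen”. A real system would do the same lookup against a full learned vocabulary.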
Design and User Experience
- Communal Computing is about designing devices that are inherently shared: home assistants, home automation, and more. The “single account/user” model doesn’t work.
- A microphone that only “hears” frequencies above the human hearing range can be used to detect human activities (for example, in a smart home device) without recording speech.
- Digital Twins in aerospace at scale: One problem with the adoption of digital twins is that the twin is very specific to a single device. This research shows that it’s possible to model real-world objects in ways that can be reused across collections of objects and different applications.
- The Open Insulin Foundation is dedicated to creating the tools necessary to produce insulin at scale. This is the next step in a long-term project by Anthony DiFranco and others to challenge the pharma companies’ monopoly on insulin production, and create products at a small fraction of the price.
- Where’s the work on antivirals and other treatments for COVID-19? The answer is simple: Vaccines are very profitable. Antivirals aren’t. This is a huge, institutional problem in the pharmaceutical industry.
- The National Covid Cohort Collaborative (N3C) is a nationwide database of anonymized medical records of COVID patients. What’s significant isn’t COVID, but that N3C is a single database, built to comply with privacy laws, that’s auditable, and that’s open for any group to make research proposals.
- Can medical trials be sped up by re-using control data (data from patients who were in the control group) from previous trials? Particularly for rare and life-threatening diseases, getting trial volunteers is difficult because nobody wants to be assigned to the control group.
- A remote monitoring patch for COVID patients uses AI to understand changes in the patient’s vital signs, allowing medical staff to intervene immediately if a patient’s condition worsens. Unlike most such devices, it was trained primarily on Black and Hispanic patients.
- Machine learning in medicine is undergoing a credibility crisis: poor data sets with limited diversity lead to poor results.
- Microsoft, OpenAI, and GitHub have announced a new service called Copilot that uses AI to make suggestions to programmers as they are writing code (currently in “technical preview”). It is truly a cybernetic pair programmer.
- Windows 11 will run Android apps. If nothing else, this is a surprise. Android apps will be provided via the Amazon store, not Google Play.
- Microsoft’s Power Fx is a low-code programming language based on Excel formulas (which now include lambdas). Input and output are through what looks like a web page. What does it mean to strip Excel of its 2D grid? Is this a step forward or backward for low-code computing?
- Open Source Insights is a Google project for investigating the dependency chain of any open source project. It is currently limited to a few major packaging systems (including npm, Cargo, and Maven), but it will be expanded.
- Quantum computing’s first application will be in researching quantum mechanics: understanding the chemistry of batteries, drugs, and materials. In these applications, noise is an asset, not a problem.
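To make “investigating the dependency chain” (the Open Source Insights item above) concrete, here is a minimal sketch of walking a dependency graph to find everything a project pulls in indirectly. It is not Google’s implementation or API: the package names, the `DIRECT_DEPS` map, and the `transitive_deps` function are all hypothetical, and a real tool would also have to resolve version constraints.

```python
from collections import deque

# Hypothetical direct-dependency map; all package names are invented.
DIRECT_DEPS = {
    "my-app": ["web-framework", "json-parser"],
    "web-framework": ["http-core", "json-parser"],
    "json-parser": ["string-utils"],
    "http-core": ["string-utils"],
    "string-utils": [],
}


def transitive_deps(package, graph):
    """Return {dependency: depth} for every package reachable from `package`.

    Depth 1 is a direct dependency; larger depths are indirect ones.
    Breadth-first search records the shortest path to each dependency.
    """
    depths = {}
    queue = deque((dep, 1) for dep in graph.get(package, []))
    while queue:
        name, depth = queue.popleft()
        # Skip packages already seen (and the root itself), so shared
        # or cyclic dependencies do not cause repeated work.
        if name in depths or name == package:
            continue
        depths[name] = depth
        queue.extend((dep, depth + 1) for dep in graph.get(name, []))
    return depths


for name, depth in sorted(transitive_deps("my-app", DIRECT_DEPS).items()):
    kind = "direct" if depth == 1 else "indirect"
    print(f"{name}: depth {depth} ({kind})")
```

Even in this tiny example, half of what `my-app` depends on never appears in its own dependency list, which is why supply chain tooling needs to look past the first level.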