On Wednesday, Colorado expanded the scope of its privacy law, initially designed to protect biometric data like fingerprints or face images, becoming the first state in the nation to also shield sensitive neural data.
That could stop companies from hoarding brain activity data without residents realizing the risks. The New York Times reported that neural data is increasingly being collected and sold nationwide. And after a market analysis showed that investments in neurotechnology leapt by 60 percent globally from 2019 to 2020—and were valued at $30 billion in 2021—Big Tech companies have significantly intensified plans to develop their own products to rake in potentially billions.
For instance, in 2023, Meta demoed a wristband with a neural interface used to control its smart glasses and unveiled an AI system that could be used to decode the mind. In January, Elon Musk announced that Neuralink had implanted in a human its first brain chip, which can be used to control a device with the user's thoughts. And just last month, Apple Insider reported that "Apple is working on technology that could turn the Apple Vision Pro into a brainwave reader to improve mental health, assist with training and workouts, and help with mindfulness."
It's easy to get the impression that Discord chat messages are ephemeral, especially across different public servers, where lines fly upward at a near-unreadable pace. But someone claims to be catching and compiling that data and is offering packages that can track more than 600 million users across more than 14,000 servers.
Joseph Cox at 404 Media confirmed that Spy Pet, a service that sells access to a database of purportedly 3 billion Discord messages, offers data "credits" to customers who pay in bitcoin, ethereum, or other cryptocurrency. Searching individual users will reveal the servers that Spy Pet can track them across, a raw and exportable table of their messages, and connected accounts, such as GitHub. Ominously, Spy Pet lists more than 86,000 other servers in which it has "no bots," but "we know it exists."
An example of Spy Pet's service from its website. Shown are a user's nicknames, connected accounts, banner image, server memberships, and messages across those servers tracked by Spy Pet. [credit: Spy Pet]
As Cox notes, Discord doesn't make messages inside server channels, like blog posts or unlocked social media feeds, easy to publicly access and search. But many Discord users may not expect their messages, server memberships, bans, or other data to be grabbed by a bot, compiled, and sold to anybody wishing to pin them all on a particular user. 404 Media confirmed the service's function with multiple user examples. Private messages are not mentioned by Spy Pet and are presumably still secure.
Electricity supply is becoming the latest chokepoint to threaten the growth of artificial intelligence, according to leading tech industry chiefs, as power-hungry data centers add to the strain on grids around the world.
Billionaire Elon Musk said this month that while the development of AI had been “chip constrained” last year, the latest bottleneck to the cutting-edge technology was “electricity supply.” Those comments followed a warning by Amazon chief Andy Jassy this year that there was “not enough energy right now” to run new generative AI services.
Amazon, Microsoft, and Google parent Alphabet are investing billions of dollars in computing infrastructure as they seek to build out their AI capabilities, including in data centers that typically take several years to plan and construct.
If you stream or play online, you probably know RNNoise, which uses the magic of neural networks to strip out the background noise that ruins your recordings or your CS matches. Today's good news is that a new version has just come out.
For those who don't know it, RNNoise is an open source library developed by the little geniuses at Xiph.Org and Mozilla that uses a recurrent neural network model to filter noise in real time while preserving voice quality.
And the new features are cool:
RNNoise isn't limited to removing noise from your video calls. This marvel can also improve speech recognition, music processing, and plenty of other tasks. In addition to the most likely voice signal, the model also reports how reliable its estimate is, which is very useful for automatic speech recognition. But that's not the only factor at play: speaker characteristics, language models, and signal processing techniques matter too.
To try it out, it's dead simple:
Start by cloning the RNNoise GitHub repository
git clone https://github.com/xiph/rnnoise.git
Then build the thing by running these commands
./autogen.sh
./configure
make
Pro tip: use the -march=native option in your CFLAGS to take full advantage of AVX2 optimizations!
You can now test RNNoise on a raw 16-bit / 48 kHz audio file
./examples/rnnoise_demo input.pcm output.pcm
And there you go, your audio will come out squeaky clean, rid of all unwanted noise. Let me know what you think!
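Since the demo wants a raw file rather than a WAV, here is a minimal sketch of one way to prepare the input, using only Python's standard wave module. The function name wav_to_raw_pcm and the file names are hypothetical, and the mono check is an assumption on my part (the text above only says 16-bit / 48 kHz). It does no resampling: it just checks the format and strips the WAV header.

```python
import wave


def wav_to_raw_pcm(wav_path: str, pcm_path: str) -> int:
    """Write the samples of a WAV file as headerless PCM; return the bytes written.

    Hypothetical helper: it only validates the format, it does not resample.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != 48000:
            raise ValueError("expected a 48 kHz sample rate")
        if wav.getnchannels() != 1:
            # Assumption: the demo is fed a single channel.
            raise ValueError("expected mono audio")
        frames = wav.readframes(wav.getnframes())

    # WAV stores 16-bit samples little-endian, so the bytes are copied as-is.
    with open(pcm_path, "wb") as out:
        out.write(frames)
    return len(frames)


# Example (hypothetical file names):
# wav_to_raw_pcm("input.wav", "input.pcm")
```

If your recording is stereo or at another sample rate, convert it with your usual audio tool first; the checks above will tell you when that's needed.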
If you want to dig deeper, I recommend taking a look at the RNNoise benchmarks on OpenBenchmarking. You'll see it's far from a gimmick: on a big beefy CPU, it can process audio at 60 times real time! Enough to livestream on Twitch with total peace of mind. It's also fun to see that RNNoise does great on exotic architectures like POWER or ARM chips. The developers really did a good job making their code portable. Respect! 🙌
Right, I won't bore you any longer, and I invite you to read Jean-Marc Valin's excellent article. It's fascinating to see how deep learning can be used to improve traditional signal processing algorithms.
An accomplished and prominent transplant surgeon in Texas allegedly falsified patient data in a government transplant waiting list, which may have prevented his own patients from receiving lifesaving liver transplants, according to media reports and hospital statements.
Memorial Hermann-Texas Medical Center halted its liver transplant program on April 3 after finding "irregularities" with donor acceptance criteria, the Houston Chronicle reported based on a statement from the hospital. At the time there were 38 patients on the hospital's wait list for a liver. Earlier this week, the hospital also halted its kidney transplant program, telling the Chronicle that it was pausing operations to "evaluate a new physician leadership structure."
Memorial Hermann has not named the surgeon behind the "inappropriate changes," but The New York Times identified him as Dr. Steve Bynon, a surgeon who has received numerous accolades and, at one point, appears to have been featured on a billboard. Bynon oversaw both the liver and kidney transplant programs at Memorial Hermann.
To settle a class-action dispute over Chrome's "Incognito" mode, Google has agreed to delete billions of data records reflecting users' private browsing activities.
In a statement provided to Ars, users' lawyer, David Boies, described the settlement as "a historic step in requiring honesty and accountability from dominant technology companies." Based on Google's insights, users' lawyers valued the settlement between $4.75 billion and $7.8 billion, the Monday court filing said.
Under the settlement, Google agreed to delete class-action members' private browsing data collected in the past, as well as to "maintain a change to Incognito mode that enables Incognito users to block third-party cookies by default." This, plaintiffs' lawyers noted, "ensures additional privacy for Incognito users going forward, while limiting the amount of data Google collects from them" over the next five years. Plaintiffs' lawyers said that this means that "Google will collect less data from users’ private browsing sessions" and "Google will make less money from the data."
AT&T reset passcodes for millions of customers after acknowledging a massive leak involving the data of 73 million current and former subscribers.
"Based on our preliminary analysis, the data set appears to be from 2019 or earlier, impacting approximately 7.6 million current AT&T account holders and approximately 65.4 million former account holders," AT&T said in an update posted to its website on Saturday.
An AT&T support article said the carrier is "reaching out to all 7.6 million impacted customers and have reset their passcodes. In addition, we will be communicating with current and former account holders with compromised sensitive personal information." AT&T said the leaked information varied by customer but included full names, email addresses, mailing addresses, phone numbers, Social Security numbers, dates of birth, AT&T account numbers, and passcodes.
Redis, a tremendously popular tool for storing data in-memory rather than in a database, recently switched its licensing from an open source BSD license to both a Source Available License and a Server Side Public License (SSPL).
The software project and company supporting it were fairly clear in why they did this. Redis CEO Rowan Trollope wrote on March 20 that while Redis and volunteers sponsored the bulk of the project's code development, "the majority of Redis’ commercial sales are channeled through the largest cloud service providers, who commoditize Redis’ investments and its open source community." Clarifying a bit, "cloud service providers hosting Redis offerings will no longer be permitted to use the source code of Redis free of charge."
Clarifying even further: Amazon Web Services (and lesser cloud giants), you cannot continue reselling Redis as a service as part of your $90 billion business without some kind of licensed contribution back.
Is energy efficiency incompatible with our exponentially growing use of the web? No, quite the opposite. The datacenters at the heart of the network have every interest in optimizing how they operate. Take cloud service provider Infomaniak, which is trying to change the game. Its teams are putting significant resources into next-generation datacenters. Numerama went to visit the one at the Bistoquette cooperative, near Geneva.
On Monday, Florida became the first state to ban kids under 14 from social media without parental permission. It appears likely that the law—considered one of the most restrictive in the US—will face significant legal challenges, however, before taking effect on January 1.
Under HB 3, apps like Instagram, Snapchat, or TikTok would need to verify the ages of users, then delete any accounts for users under 14 when parental consent is not granted. Companies that "knowingly or recklessly" fail to block underage users risk fines of up to $10,000 in damages to anyone suing on behalf of child users. They could also be liable for up to $50,000 per violation in civil penalties.
In a statement, Florida governor Ron DeSantis said the "landmark law" gives "parents a greater ability to protect their children" from a variety of social media harm. Florida House Speaker Paul Renner, who spearheaded the law, explained some of that harm, saying that passing HB 3 was critical because "the Internet has become a dark alley for our children where predators target them and dangerous social media leads to higher rates of depression, self-harm, and even suicide."
Did you know that Parquet files fancied themselves bombs? Not Latin bombshells, mind you, but zip bombs.
For those just landing from planet Mars, Parquet has become the format of choice for exchanging tabular data. Widely used across everything Big Data, it blows crusty old CSV out of the water: Parquet is binary, it's columnar, it's compressed, it's great!
But beware, behind this apparent perfection lurks a mortal danger for your hard drives and SSDs! Even a perfectly valid Parquet file can wreak havoc and crash all your services.
How? Simply with this file of just 42 KB that contains... brace yourself... more than 4 PETABYTES of data!! Yes, that's 4 million gigabytes in a measly 42 KB file. You had to dare.
It's called a decompression bomb! So how does it work?
Well, it's thanks to a devilish little sleight of hand called "dictionary encoding". Basically, you give it a dictionary with a single value, then reference that value over and over, about 2 billion times. The result is a tiny file, because it compresses as much as possible, but one that expands into a monstrously gigantic table once decompressed.
It's subtle... but it's vicious! 😈
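To make the trick concrete, here is a toy sketch in pure Python of a dictionary-encoded column with run-length-encoded references. It is not the real Parquet layout and it doesn't touch any Parquet library: the DictColumn class, the safe_to_decode helper, and the numbers (a 1 KiB value, an 8 GiB budget) are all hypothetical, chosen only to show how a single stored value plus a repeat count describes a huge table, and how you can estimate the expanded size before decoding anything.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DictColumn:
    """Toy dictionary-encoded column: distinct values plus run-length-encoded indices."""

    dictionary: List[bytes]      # each distinct value is stored only once
    runs: List[Tuple[int, int]]  # (index into dictionary, repeat count)

    def row_count(self) -> int:
        return sum(count for _, count in self.runs)

    def decoded_size(self) -> int:
        """Bytes needed to materialize every row, computed without decoding."""
        return sum(len(self.dictionary[idx]) * count for idx, count in self.runs)


def safe_to_decode(column: DictColumn, budget_bytes: int) -> bool:
    """Compare the expanded size against a budget before materializing anything."""
    return column.decoded_size() <= budget_bytes


# Hypothetical numbers: one 1 KiB value referenced 2 billion times.
bomb = DictColumn(dictionary=[b"x" * 1024], runs=[(0, 2_000_000_000)])
print(bomb.row_count())                    # 2000000000
print(bomb.decoded_size())                 # 2048000000000 (about 2 TB)
print(safe_to_decode(bomb, 8 * 1024**3))   # False with an 8 GiB budget
```

The encoded form here is one 1 KiB value and a single (index, count) pair, while the decoded form would be about 2 TB for that one column. A check of this kind before decoding is one possible safeguard, not something the original post prescribes.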
Just imagine the carnage if you toss this innocent-looking file into your Big Data pipeline without paying attention... Boom! 💥 General crash, systemic failure, nuclear apocalypse! Your services will try to read this file thinking it's a nice little Parquet file of nothing much, and then... Surprise! Total chaos. Your cluster will melt like snow in the sun trying to swallow those petabytes of data.
Moral of the story: be careful with everything, even what you unzip.
And if you have a bit of room on your hard drive, you can always try your luck by downloading 42.zip here. (NO, DON'T UNZIP THIS THING!! BAD IDEA!!) (the zip password is: 42)
Mozilla's Monitor Plus, a service launched by the privacy-minded tech firm in February, notes on its pitch page that there is "a $240 billion industry of data brokers selling your private information for profit" and that its offering can "take back your privacy."
Mozilla's most recent move to protect privacy has been to cut out one of the key providers of Monitor Plus' people-search protections, Onerep. That comes after reporting from security reporter Brian Krebs, who uncovered Onerep CEO and founder Dimitri Shelest as the founder of "dozens of people-search services since 2010," including one, Nuwber, that still sells the very kind of "background reports" that Monitor Plus seeks to curb.
Shelest told Krebs in a statement (PDF) that he did have an ownership stake in Nuwber, but that Nuwber has "zero cross-over or information-sharing with Onerep" and that he no longer operates any other people-search sites. Shelest admitted the bad look but said that his experience with people search gave Onerep "the best tech and team in the space."
After public outcry, General Motors has decided to stop sharing driving data from its connected cars with data brokers. Last week, news broke that customers enrolled in GM's OnStar Smart Driver app have had their data shared with LexisNexis and Verisk.
Those data brokers in turn shared the information with insurance companies, resulting in some drivers finding it much harder or more expensive to obtain insurance. To make matters much worse, customers allege they never signed up for OnStar Smart Driver in the first place, claiming the choice was made for them by salespeople during the car-buying process.
Now, in what feels like an all-too-rare win for privacy in the 21st century, that data-sharing deal is no more.
Researchers have unearthed never-before-seen wiper malware tied to the Kremlin and an operation two years ago that took out more than 10,000 satellite modems located mainly in Ukraine on the eve of Russia’s invasion of its neighboring country.
AcidPour, as researchers from security firm Sentinel One have named the new malware, has stark similarities to AcidRain, a wiper discovered in March 2022 that Viasat has confirmed was used in the attack on its modems earlier that month. Wipers are malicious applications designed to destroy stored data or render devices inoperable. Viasat said AcidRain was installed on more than 10,000 Eutelsat KA-SAT modems used by the broadband provider seven days prior to the March 2022 discovery of the wiper. AcidRain was installed on the devices after attackers gained access to the company’s private network.
Sentinel One, which also discovered AcidRain, said at the time that the earlier wiper had enough technical overlaps with malware the US government attributed to the Russian government in 2018 to make it likely that AcidRain and the 2018 malware, known as VPNFilter, were closely linked to the same team of developers. In turn, Sentinel One’s report Thursday noting the similarities between AcidRain and AcidPour provides evidence that AcidPour was also created by developers working on behalf of the Kremlin.
Few Ars readers will have been surprised by the news from last week concerning General Motors' connected cars. As The New York Times reported, some owners of vehicles made by General Motors have been having a hard time getting car insurance. The reason? They unwittingly agreed to share their driving data with a third party. Now, at least one driver is suing. If more follow suit, this could be the push the industry needs to do better.
The heart of the problem is one of GM's OnStar connected-car services, called Smart Driver. We've tested it out in the past—it monitors things like how fast you drive, how hard you accelerate and brake, how often you drive at night, and your fuel economy, then uses that data to generate a numerical score from 0 to 100, with a higher number indicating that you're a safer driver.
These kinds of services can be useful—most people think they're great drivers until they start getting independent feedback. And the data that Smart Driver collects really can help you drive more economically and with less risk. But as I noted at the time, I was glad my insurance rates weren't at risk via data sharing with an insurer.
Glassdoor, where employees go to leave anonymous reviews of employers, has recently begun adding real names to user profiles without users' consent, a Glassdoor user named Monica was shocked to discover last week.
"Time to delete your Glassdoor account and data," Monica, a Midwest-based software professional, warned other Glassdoor users in a blog. (Ars will only refer to Monica by her first name so that she can speak freely about her experience using Glassdoor to review employers.)
Monica joined Glassdoor about 10 years ago, she said, leaving a few reviews for her employers, taking advantage of other employees' reviews when considering new opportunities, and hoping to help others survey their job options. This month, though, she abruptly deleted her account after she contacted Glassdoor support to request help removing information from her account. She never expected that instead of removing information, Glassdoor's support team would take the real name that she provided in her support email and add it to her Glassdoor profile—despite Monica repeatedly and explicitly not consenting to Glassdoor storing her real name.
Did you think Apple was content to follow the other tech giants when it comes to artificial intelligence? Think again! The Cupertino firm has just unveiled the secrets of its new AI model, named MM1, and believe me, it's seriously impressive!
While Google is betting on its Gemini model to boost the AI features of iOS 18, Apple has decided to chart its own course with MM1. The stroke of genius? Using a diverse dataset that freely mixes text and images to train this next-generation AI.
As a result, MM1 can generate highly accurate captions for images, answer questions about images, and even infer natural language from linguistic and visual cues. A real competition beast!
By combining various training methods from other AIs with its own in-house techniques, Apple hopes to catch up with heavyweights like Google or OpenAI. And given the level of performance announced, there's reason to be optimistic!
So how does it work?
Well, if you show MM1 a photo of your cat, not only will it be able to recognize and describe it with formidable precision, but it will also be able to answer questions like "What color is its collar?" or "Does it look playful or lazy in this picture?".
In one real example, it is asked how much it should pay for the beers (photo 1) based on the menu (photo 2). And it's the only one to answer correctly, and precisely.
Impressive, right?
And that's just one example among others. Apple being Apple, we can expect MM1 to revolutionize the way we interact with our devices. Why not imagine an app that automatically generates a text description of a scene you've photographed? Or a universal "subtitles" mode that would transcribe in real time everything your iPhone sees and hears? The possibilities are endless once the AI runs on-device!
Of course, all of this is still only at the research stage for now. But knowing Apple, we can bet that the firm will quickly put MM1's promises to work in its future products and services. iOS 19 powered by a super-powerful multimodal AI is the stuff of dreams, I won't hide it.
With MM1, Apple proves once again its ability to innovate. While the other Silicon Valley giants are content to improve their existing models, the Apple brand prefers to start from a blank page to invent the AI of tomorrow. As they say, "think different" has its upsides! 😎
So, what do you think of this MM1? Can't wait to see what Apple has in store for us next.
Personally, I'm already looking forward to chatting with my iPhone as if it were my best buddy. At least I'd have a buddy ^^.
Meta is considering cutting monthly subscription fees for Facebook and Instagram users in the European Union nearly in half to comply with the Digital Markets Act (DMA), Reuters reported.
During a day-long public workshop on Meta's DMA compliance, Meta's competition and regulatory director, Tim Lamb, told the European Commission (EC) that individual subscriber fees could be slashed from 9.99 euros to 5.99 euros. Meta is hoping that reducing fees will help to speed up the EC's process for resolving Meta's compliance issues. If Meta's offer is accepted, any additional accounts would then cost 4 euros instead of 6 euros.
Lamb said that these prices are "by far the lowest end of the range that any reasonable person should be paying for services of these quality," calling it a "serious offer."