This week we begin with an article on why COVID-19 data in the United States is unreliable. Then, we have a story on Google’s removal of 2.7 billion “bad ads” for violating the company’s ad policies. The next piece covers data synthesis for building effective artificial intelligence systems. The following story is on scammers in the United States using fake coronavirus stimulus payment websites in an attempt to steal data and money. After this, we cover a database exposure at a French daily newspaper that leaked roughly 7.4 billion records containing personally identifiable information of reporters and employees. Lastly, we have included an article on how banks could assess individual credit applications by leveraging alternative data in the post-COVID-19 economy.
Coronavirus Data in the U.S. Is Terrible, and Here’s Why
We have more data than ever to track a growing number of coronavirus cases, tests, and deaths. But can we rely on these numbers? Every day now comes with a new set of coronavirus data: numbers for positive tests, negative tests, deaths, patients hospitalized, ventilator shortfalls and hospital beds occupied. And, more rarely, the racial and ethnic breakdown of those who have tested positive, and those who have died.
Google removed 2.7 billion bad ads, nearly 1 million ad accounts in 2019
This year, the company says it has removed “tens of millions” of COVID-19 related ads. Last year, Google says it took down 2.7 billion so-called bad ads for violating the company’s ad policies, according to its annual report released Thursday. That’s up from the 2.3 billion bad ads Google reported taking down in 2018. The number of ad accounts Google terminated remained relatively flat from the previous year at nearly one million.
Why use synthetic data?
Artificial intelligence: it’s the “magic” that can solve every business problem imaginable. Except when it can’t. Often, even where AI systems could provide revolutionary solutions, there are practical limitations. If your AI is going to learn from data, how do you make sure it has the right amount of data and that it’s data you can use without heading straight for a legal minefield? This is where data synthesis comes in.
Scammers are using fake coronavirus stimulus payment sites to steal your money
If you’re awaiting a federal stimulus payment and you haven’t filed tax returns, beware: Hackers have set their sights — and sites — on your $1,200 check. Scammers have set up more than 180,000 coronavirus-themed websites in an attempt to steal data or misinform consumers, according to data from Checkphish by Bolster. The security firm has spotted more than 149,000 suspicious domain registrations with the term “stimulus check” in them.
Le Figaro caught out in database exposure
Le Figaro, a French daily newspaper, has been found to have inadvertently exposed roughly 7.4 billion records containing personally identifiable information (PII) of reporters and employees. As well as exposing data about its own staff, the leading French daily also exposed information relating to some 42,000 users, as Bleeping Computer has reported. The exposure is significant for the news media: Le Figaro’s site is the most visited news site in France, with an audience of more than 23 million monthly unique visitors.
Why unlocking alternative data should be integral to credit models in the post-COVID-19 economy
It’s no secret that South African banks and financial services providers draw on massive volumes of data to make decisions and provide targeted services to middle and upper-income earners. When assessing individual credit applications, for example, banks will draw on highly detailed customer profiles and many years of data collection.
Source: https://mailchi.mp/zigram/data-asset-weekly-dispatch_4_may