Episode 175 — Data Provenance and Privacy: Personal Privacy and the Rise of AI

Artificial intelligence is not new. But AI, now an acronym in common usage, is dominating markets, politics, industry, and our attention. And its use affects personal privacy.

Let's take a couple of examples. Bathsheba was the mother of Solomon in Torah and biblical days. Solomon's father was King David. If you ask Google's Gemini what Bathsheba's ethnicity was, you'll get an answer that this is uncertain but that she was probably Hebrew. We can't ask her, because she died about three millennia ago. But what does Gemini base its answer on when it says "probably Hebrew"? Asking that question is asking about data provenance. And if we were asking about a living person and who that individual's parents are - for example, if the person was adopted - that's a very private matter. Let's say the data says a major politician is the mother or father of someone else - and the individual denies it. You can see how privacy matters are very much at stake when a dataset is being created and then used to provide information.

So what is data provenance? Think about it as the genealogy of information.

Where did the information originate? How was it used and misused and transformed? How do we know if data is reliable and trustworthy or whether it is disinformation or hallucination? This is where the provenance of a dataset comes in.

Datasets are the lifeblood of AI. "Garbage in, garbage out" is a saying from the computer age. Now it is also misinformation in, disinformation out. This can have major consequences as AI takes over more and more of the roles humans used to play, when oral transmission and later hard copy were the ways information was shared, stories were told, myths were created, and science progressed. In the digital age, it all happens much faster. As AI expands, so does the risk that datasets will be unreliable and invade people's personal privacy.

So there is a need for standards of data provenance: rules for determining the lineage of information and the reliability of a dataset. Otherwise, AI will be misused, create harm, and improperly invade the privacy and reputations of individuals.

So how are standards for the provenance of data and datasets being created?

Government regulation is one approach. But here governments are catching up with technological change. There is no gold standard, or perhaps any colorable standard, set by any government or governments about data provenance.

There are some efforts under way from the tech and business world to create data provenance standards. One is the Data Provenance Initiative, powered by Cohere, which aims to track and trace data sources within datasets. It has audited and traced almost 2,000 datasets to determine the provenance of the data held within. It uses a variety of techniques to do this, so we can measure the origin, accuracy, and reliability of the data that feeds AI. Check out https://dataprovenance.org.

Another major effort is by the Data and Trust Alliance, a nonprofit group of major Western companies, including Walmart, Deloitte, Humana, Pfizer, Mastercard, and others. The goal is to produce voluntary standards for data provenance. Think of this like voluntary standards for safety in other fields: the UL label on electrical products, which is not a government mark, or food safety standards, or labels stating whether food is organic and whether fish is farmed or wild, endangered or not.

We'll talk in the next episode with representatives of the Data and Trust Alliance about standards announced in June 2024 to cover:

Data type, source, legal rights, privacy and protection, lineage, generation date and material, intended use, and restrictions.
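To make that list concrete, here is a minimal sketch of how a dataset's provenance label might be recorded and checked for gaps. This is an illustration only, not the Alliance's actual schema: the class name ProvenanceRecord, the field names, the helper missing_fields, and the sample values are all assumptions made up for this example, with the fields simply mirroring the topics listed above.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import List


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance label; fields mirror the topics listed above."""
    data_type: str
    source: str
    legal_rights: str
    privacy_and_protection: str
    lineage: List[str]            # earlier datasets this one was derived from
    generation_date: date
    generation_material: str      # wording taken from the list above
    intended_use: str
    restrictions: List[str] = field(default_factory=list)


# Assumption: a dataset may legitimately carry no restrictions.
OPTIONAL_FIELDS = {"restrictions"}


def missing_fields(record: ProvenanceRecord) -> List[str]:
    """Return the names of required fields that were left empty."""
    return [
        name
        for name, value in asdict(record).items()
        if name not in OPTIONAL_FIELDS and value in ("", [], None)
    ]


# Invented sample record with two gaps: no stated legal rights, no lineage.
record = ProvenanceRecord(
    data_type="text",
    source="public web forum posts",
    legal_rights="",
    privacy_and_protection="may contain personal names",
    lineage=[],
    generation_date=date(2024, 6, 1),
    generation_material="user-written posts",
    intended_use="language model training",
)

gaps = missing_fields(record)
print(gaps)
```

A label with gaps like these is the kind of signal a voluntary standard could surface before a dataset is used to train or answer anything.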

Privacy aspects are an essential part of setting standards. Can AI use personal data of individuals gathered without their consent? Can AI use personal identities in formulating datasets about a mass of people? How reliable is data that cannot be traced definitively back to someone who grants consent and has the ability to correct or delete such data? What happens if personally identifiable information is included in a dataset and is then broadcast or leaked, allowing someone to take malicious or unwanted action against that individual? Does anyone truly have a right to be forgotten? A right not to be doxxed? A right not to be maligned because of one's identity, race, national origin, religion?

Tune in next week to Episode 176 as we dig deeper into data provenance and privacy.

The first 155 episodes of Data Privacy Detective can be found on the feed of the Frost Brown Todd Podcast. You can listen on Apple Podcasts (https://apple.co/3IrHUTg), Spotify (https://bit.ly/49XRU2k), or Soundcloud (https://bit.ly/3T8EWrw).
