[IP] CAIDA has indexed the AOL 500k User Session Collection in our Internet Measurement Data Catalog (DatCat)
Begin forwarded message:
From: Colleen Shannon <cshannon@xxxxxxxxx>
Date: August 9, 2006 9:15:53 PM EDT
To: David Farber <dave@xxxxxxxxxx>
Cc: k claffy <kc@xxxxxxxxx>, imdc-wg@xxxxxxxxx
Subject: Re: [dave@xxxxxxxxxx: [IP] more on AOL Releases Search Logs from 500,000 Users]
Because many folks are delving into the AOL query log data, including
people who intend to publish their results, CAIDA has indexed the AOL
500k User Session Collection in our Internet Measurement Data Catalog
(DatCat):
http://imdc.datcat.org/collection/1-003M-5=AOL+500k+User+Session+Collection
DatCat does not store or distribute data, so we are not providing the
AOL collection. Rather, we provide a permanent record of the existence
of the dataset, relevant metadata, and a permanent handle that can be
used to cite the dataset. In the near future, anyone who has used the
data will be able to add annotations describing the features of the
dataset (and any other dataset in the catalog).
The goals of the DatCat project include:
- to facilitate searching for and sharing of data among researchers
- to enhance documentation of datasets via a public annotation system
- to advance network science by promoting reproducible research
Anyone can browse the Catalog by visiting imdc.datcat.org. We are just
beginning to test one of several contribution interfaces, and we
encourage anyone who would like to be notified when DatCat is available
for public contributions of data to email contribute@xxxxxxxxxx with a
description of the data you would like to contribute.
Cheers,
The CAIDA DatCat Team
info@xxxxxxxxxx
--On Monday, August 07, 2006 10:44:59 PM -0700 k claffy <kc@xxxxxxxxx> wrote:
----- Forwarded message from David Farber <dave@xxxxxxxxxx> -----
Date: Mon, 7 Aug 2006 13:33:03 -0400
From: David Farber <dave@xxxxxxxxxx>
Subject: [IP] more on AOL Releases Search Logs from 500,000 Users
To: ip@xxxxxxxxxxxxxx
X-Mailer: Apple Mail (2.752.2)
Begin forwarded message:
From:
Date: August 7, 2006 1:12:38 PM EDT
To: David Farber <dave@xxxxxxxxxx>
Subject: Re: [IP] AOL Releases Search Logs from 500,000 Users
== Please remove my name and e-mail if you forward this to IP. ==
A search for an SSN-shaped regex on the full AOL search data returns
191 results, including repeat searches. Many of these have full
names, and at least a dozen include an address, driver's license
number, date of birth, or some combination of the three in the same
query. There's no telling how much more information an aggregation of
other queries by those same user IDs would yield.
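For readers unfamiliar with this kind of search, here is a minimal sketch of
what an "SSN-shaped" match might look like. The exact regex used above is not
given, so the pattern below (three, two, and four digits with optional hyphen
or space separators) is an assumption, as are the function name and the
made-up sample queries. A pattern like this also matches any other nine-digit
number, so hits would still need manual review.

```python
import re

# Assumed pattern: 3 digits, 2 digits, 4 digits, each group optionally
# separated by a hyphen or space, and not embedded in a longer digit run.
# This is NOT the regex from the message above, which was not published.
SSN_SHAPED = re.compile(r"(?<!\d)\d{3}[- ]?\d{2}[- ]?\d{4}(?!\d)")


def find_ssn_shaped(queries):
    """Return the queries that contain at least one SSN-shaped digit group."""
    return [q for q in queries if SSN_SHAPED.search(q)]


# Made-up sample queries for illustration; none come from the AOL data.
sample = [
    "john doe 123-45-6789",
    "pizza near 90210",
    "123 45 6789 background check",
    "call 1234567890",
]
matches = find_ssn_shaped(sample)
```

Counting len(matches) over the full query log would give a figure comparable
in kind to the result count quoted above, though a different pattern would
give a different number.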
-------------------------------------
You are subscribed as kc@xxxxxxxxx
To manage your subscription, go to
http://v2.listbox.com/member/?listname=ip
Archives at:
http://www.interesting-people.org/archives/interesting-people/
----- End forwarded message -----