About our project

Note: This was Phase 1 of a two-phase project. To see Phase 2, which is currently the active one, please click here.

At Strata RX 2012, we (NotOnly Dev) released the Doctor Social Graph, a project that visually displays the connections between doctors, hospitals and other healthcare organizations in the US. This conglomeration of data sets shows everything from the connections between doctors who refer their patients to each other to other data collected by state and national databases. It displays real names and will eventually show every city.

This is THE data set that any academic, scientist, or health policy junkie could ever want to conduct almost any study.

Our goal is to empower the patient, make the system transparent and accountable, and release this data to the people who can use it to revitalize our health system.

Why this matters to patients

It is very difficult to fairly evaluate the quality of doctors in this country. Our State Medical Boards only go after the most outrageous doctors. The doctor review websites are generally popularity contests. Doctors with a good bedside manner do well. Doctors without strong social skills can do poorly, even if they are good doctors. Using this data set, it should be possible to build software that evaluates doctors by viewing referrals as “votes” for each other.
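As a rough illustration of the “votes” idea, here is a minimal sketch that scores each provider with a PageRank-style iteration, where every referral edge is a vote weighted by its count. This is only one possible approach, not something we have built. The function name `referral_rank`, the damping value and the three-node toy graph are all made up for illustration, and it assumes that a row from A to B can be read as A referring patients to B.

```python
from collections import defaultdict

def referral_rank(edges, damping=0.85, iterations=50):
    """PageRank-style score: each referral is a vote weighted by its count.

    edges is a list of (from_id, to_id, count) tuples. Hypothetical helper,
    not part of any released tool.
    """
    nodes = {n for src, dst, _ in edges for n in (src, dst)}
    out_total = defaultdict(int)
    for src, _, count in edges:
        out_total[src] += count

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Score held by nodes that refer to nobody is shared evenly.
        dangling = sum(score[n] for n in nodes if out_total[n] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        nxt = {n: base for n in nodes}
        for src, dst, count in edges:
            # A node passes its score along in proportion to referral volume.
            nxt[dst] += damping * score[src] * count / out_total[src]
        score = nxt
    return score

# Toy graph with made-up identifiers (not real NPIs).
toy_edges = [("A", "C", 10), ("B", "C", 5), ("C", "A", 1)]
scores = referral_rank(toy_edges)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

In this toy graph, C collects votes from both A and B and ends up ranked first. A real evaluation would need to account for specialty, geography and patient volume before reading anything into such a score.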

This data set could be the best source of public information about the quality of doctors ever. More importantly, it should help doctors to encourage other doctors to improve their skills — for example, by seeking board certification. This data set will allow patients and administrators to evaluate the health system on both micro and macro scales and give them the tools to take steps towards addressing inefficiencies.

What we will be releasing next year, no matter what.

This data set, which we got from a carefully formed FOIA request against the Medicare claims database, shows how hospitals, doctors and other organizations work together. This data set was released under an “Open Source Eventually” License to Strata RX attendees. The only way to get access to this data set right now, before the data set becomes Open Source next year, will be to participate in this project. Act now, because all of the really amazing discoveries in this data set will be made in the next few months, by those who either attended Strata RX or who participate in this project.

Code the change you want to see in the world

How we plan to use your money to make the data even better.

This data set can be made substantially more valuable by merging it with other “openish” data sources on the performance of doctors and hospitals. We want to turn this into the ultimate source for open doctor and hospital data.

Almost every State Medical Board in the US releases a report about the doctors in that state. This usually includes information on the doctor's medical school, information about board certification and information on disciplinary actions against the doctor.

All of these state-level data sources believe that it is appropriate to charge $50 to $1000 for copies of this data. Frequently, the states release data that is not yet linked to the NPI data. Sometimes the data is only available in PDFs, and so on. In short, this data is currently available, but it is either messy, confusing and disconnected… or it is organized but expensive.

As a result, it is not possible to get a full profile for a particular doctor, as they potentially move between states, without paying for expensive data aggregation services. These services charge as much as $150 for data on a single doctor. At those kinds of prices, there is simply no way that a data scientist can afford to really do any significant work on doctor data.

This crowd funded project will enable us to purchase all of this data from the various public sources that sell it, and then to perform the conversion required to merge this data with the core NPI database. Our calculations indicate that for $15k we can comfortably get the state medical board data from every state in the union.
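To give a feel for the conversion work involved, here is a minimal sketch of linking state board records that carry no NPI to NPI records by matching on a normalized name and state. Everything in it is an assumption made for illustration: the field names (`last_name`, `first_name`, `state`, `npi`, `board_certified`), the helper names and the sample records are invented, and the real state files will differ from state to state. It only links a record when there is exactly one candidate, and sets the rest aside for manual review.

```python
def normalize(name):
    """Uppercase a name and collapse punctuation and extra whitespace."""
    cleaned = name.upper().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())

def match_key(rec):
    # Hypothetical matching key: (last name, first name, state).
    return (normalize(rec["last_name"]), normalize(rec["first_name"]), rec["state"])

def link_board_records(board_records, npi_records):
    """Attach an NPI to each board record that has exactly one name/state match."""
    index = {}
    for rec in npi_records:
        index.setdefault(match_key(rec), []).append(rec["npi"])

    linked, unmatched = [], []
    for rec in board_records:
        candidates = index.get(match_key(rec), [])
        if len(candidates) == 1:
            linked.append({**rec, "npi": candidates[0]})
        else:
            # Zero or several candidates: too risky to link automatically.
            unmatched.append(rec)
    return linked, unmatched

# Made-up records; the NPIs here are placeholders, not real identifiers.
npi_records = [
    {"npi": "0000000001", "last_name": "DOE", "first_name": "JANE", "state": "TX"},
    {"npi": "0000000002", "last_name": "ROE", "first_name": "SAM", "state": "TX"},
    {"npi": "0000000003", "last_name": "ROE", "first_name": "SAM", "state": "TX"},
]
board_records = [
    {"last_name": "Doe", "first_name": "Jane.", "state": "TX", "board_certified": True},
    {"last_name": "Roe", "first_name": "Sam", "state": "TX", "board_certified": False},
]

linked, unmatched = link_board_records(board_records, npi_records)
print(len(linked), "linked;", len(unmatched), "need manual review")
```

Name matching alone will miss doctors who changed names or moved between states, which is part of why this merge takes real effort.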

We want to release this data back into the open data community! We will provide this data in clean formats such as CSV, JSON or XML. But we also want to be able to provide exclusive access to this data set as a reward for participating in this crowd funding project. We came up with “Open Source Eventually” as a perfect compromise.

How “Open Source Eventually” works.

Our compromise is to use an “Open Source Eventually” license for the data. If you contribute $100 to this campaign, we will provide you with private access to this data for six months before it automatically reverts to a Creative Commons license (specifically, Attribution-ShareAlike 3.0 Unported, or CC BY-SA 3.0).

$100 for six months of exclusive access to one of the most detailed social graphs ever available is pretty reasonable. The whole point is to enable researchers who are willing to help us study this data in the open to have cheap access to a rich data set. If you are willing to innovate in the open, then your expenses should be minimized.

After six months, this data will become available to everyone under the above license. For $100 you get early access. That means that you get to be the one to write new software, submit the new NEJM article or whatever. All of the cool discoveries in this data set should happen in the first six months.

However, if you want to take this “referrals and more” database and merge it with your proprietary dataset or application, then you will need to contribute more money. This way, those who are seeking to capture value from this data (by building or extending a business in some way) will contribute more to the open research on this data. A proprietary-friendly license will allow you to do anything you want with this data, with the exception of merely republishing it.

We might continue to sell the data after this Medstartr campaign, but we will at least double our prices for access to the data after the Medstartr is over. The people who participate in this fund-raising campaign will have the best price for this data set.

What will you do with the money if you get more than $15k?

A nice vacation in Hawaii. Just kidding.

There are lots of holes in the data that we have. We do not have referral data for the doctors who serve veterans in the VA, and we do not have any referral information about pediatricians. But we think we could fix that with further FOIA requests to the VA and Medicaid. We would also like to see if we could get the graph for doctors who get money from CMS in different ways (e.g. Medicare Advantage).

There is information about hospitals that is available from the IRS, or from the hospitals themselves. There are some interesting data sets regarding surgeons' relationships with implanted device manufacturers. The list of wonderful things that need to be added to this data just goes on and on.

We are pretty confident that we can continue adding new data to this open data set up to around $100k. As long as you keep giving money, we will keep increasing the amount of data we give back to you. The more money we get, the more data we will be providing to you. Which brings us to:

Why should I support this project even if it is already funded? Won’t I get all of this data eventually anyways if I wait long enough?

Yes, that is true. The main benefit that supporters of this project get is early access to this data. But if you can afford to contribute $100 more, then we can find some more data to add to this open data set. Who knows, maybe that extra money will enable you to get clean data in just the format that you need, to enable your data research process.

This is awesome and I want you to get a specific data set for me.

If you are willing to sponsor this project at the $5000 level, then we will actively consult you regarding what data to acquire next. We are specifically committing to “freeing” all of the state medical board information, but there are lots of directions to go in next. Our $5000 level sponsors (whether they want credit or would like to stay anonymous) will help us determine how we spend any money over $15k. Of course, sponsorship at this level also includes any of the other rewards, including access to the data set with a proprietary license.

What happens if you do not meet your goals

Simple. No harm, no foul. We have plenty of other work that pays better than this, so do not think we will go hungry or anything. We are doing this as one of our side-projects to benefit the data science community. If the data science community does not need/want this data, then we will live.

But, as an added reward, if you are willing to contribute to this Medstartr campaign and it does not make its goal, then we will still provide you with a copy of the referral data set that we released at Strata RX. Participating in this Medstartr at the $100 level or above is the only way to get a copy besides attending Strata RX. And, if you commit to pay, you will get a copy of this data no matter what.

If we make it, we will be able to provide you with tens of GB of data about specific doctors and hospitals, but even if we do not make it, we will provide you with 2 or 3 GB, just for believing. So, it’s pretty much win-win for you. If we make it, you pay and get a huge amount of doctor data. If we do not make it, then you get early access to a little doctor data for free!!

About Us

NotOnly Dev is a Health IT software incubator company formed by Fred Trotter, Rick Trotter and Ashish Patel. We are a “not-only-for-profit” company. Of course, we are still a for-profit endeavor, but we have a very specific social mission: to use software and data to empower patients. On some projects, we make money. On others, our goal is to make patients’ lives better. Most of the time, we can find ways to do a little of both at the same time. You are welcome to hire us for your healthcare development project. We encourage that.

Twitter: @fredtrotter
For more crazy ideas: Patient Skunkworks Projects

TL;DR summary

You give us money. We give you lots of doctor data.

What does the data look like?

Here is a sample that shows what the file looks like when searching (using grep) for a specific NPI, in this case Methodist Hospital in Houston, TX:

>grep 1548387418 refer.2011.csv > Methodist_Hospital_Referrals.csv

This results in the following data, which is of the form:

NPI_Seen_First,NPI_Seen_Second,Seen_Count

1184710477,1548387418,55
1548387418,1326047754,62
1548387418,1598971913,24
1548387418,1558430330,254
1548387418,1154308633,74
1548387418,1942276605,76
1548387418,1659412336,5643
1902898455,1548387418,41
1548387418,1861490005,76
1730260035,1548387418,57
1033190681,1548387418,15
1679678767,1548387418,132
1710982798,1548387418,114

Here is the link to the full results of that search: http://pastebin.com/E7Mv8RmL
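Because grep matches the NPI in either column, the result mixes rows where the hospital was seen first with rows where it was seen second. As a small sketch of how one might separate the two directions, the Python below parses the sample rows shown above and totals Seen_Count each way. The function name `split_by_direction` is ours for illustration, and reading "seen first" as the referring side is an assumption about the column semantics.

```python
import csv
import io

# The sample rows from the grep output above
# (NPI_Seen_First,NPI_Seen_Second,Seen_Count).
SAMPLE = """\
1184710477,1548387418,55
1548387418,1326047754,62
1548387418,1598971913,24
1548387418,1558430330,254
1548387418,1154308633,74
1548387418,1942276605,76
1548387418,1659412336,5643
1902898455,1548387418,41
1548387418,1861490005,76
1730260035,1548387418,57
1033190681,1548387418,15
1679678767,1548387418,132
1710982798,1548387418,114
"""

def split_by_direction(rows, npi):
    """Return (outbound, inbound) dicts mapping the other NPI to Seen_Count."""
    outbound, inbound = {}, {}
    for first, second, count in rows:
        if first == npi:
            # npi was seen first, the other provider second.
            outbound[second] = outbound.get(second, 0) + int(count)
        elif second == npi:
            # The other provider was seen first, npi second.
            inbound[first] = inbound.get(first, 0) + int(count)
    return outbound, inbound

rows = list(csv.reader(io.StringIO(SAMPLE)))
outbound, inbound = split_by_direction(rows, "1548387418")
print("seen first by this NPI:", sum(outbound.values()))   # 6209
print("seen second by this NPI:", sum(inbound.values()))   # 414
```

To run this against the full file instead of the sample, pass an open file handle for Methodist_Hospital_Referrals.csv to csv.reader in place of the StringIO object.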

Thank you for your interest and support!

Yours Truly,
Fred Trotter

Rewards

For $ 10 or more

16 Supporter(s)

I just want to browse: 1 YEAR OF UNLIMITED SEARCHES using the web portal that we are building for browsing the data set ($30 after Medstartr)

For $ 50 or more

13 Supporter(s)

CODE THE CHANGE SHIRT: The t-shirt will feature the phrases: "I hacked the healthcare graph" and "code the change you want to see in the world".

For $ 100 or more

23 Supporter(s)

OPEN SOURCE DATA PURCHASE: You will get the entire database under an Open Source Eventually (viral) data license. This will give you access to everything, but you will not be able to integrate this data with any data that you are unwilling to release. See the text for what we mean by "Eventually".

For $ 105 or more

16 Supporter(s)

The t-shirt and the data: If you want a t-shirt and the open source data, that will cost you.

For $ 250 or more

1 Supporter(s)

A limited edition print, celebrating the release of this data set, from renowned patient artist Regina Holliday. Her art frequently goes for $5k at auction, so these should almost immediately be worth more than you paid for them. Plus they are dripping with awesome. And, you still get a T-shirt!

For $ 1000 or more

11 Supporter(s)

Proprietary-friendly Data License: This will ensure that you are able to use all of this data in any way you like (except just offering it for direct download, etc.) without concern that your own data/software would need to be released. If you want to build a proprietary product with this data set, this backing level is for you.

For $ 5000 or more

1 Supporter(s)

Loud Partner: If you would like to specifically sponsor our work, and/or you would like to help direct how we gather data beyond the $15k mark, this level is for you. As a bonus, this level will include full credit for sponsoring.

For $ 5000 or more

0 Supporter(s)

Silent Partner: If you would like to specifically sponsor our work, and/or you would like to help direct how we gather data beyond the $15k mark, this level is for you. As a bonus, this level will include us never telling anyone that you sponsored.

For $ 10000 or more

0 Supporter(s)

GRAPH YOUR NETWORK: If you want your own network of doctors analyzed using our GUI tools (including the graph laid out on a map), this is for you. Includes 10 hours of on-site consulting.


Kathleen Oyola, backed on 11/08/2012
Kathy Lewis, backed on 11/08/2012
Kenneth Purfey, backed on 11/08/2012
Jonah Platt, backed on 11/08/2012
Gangadhar Sulkunte, backed on 11/07/2012
James Rosoff, backed on 11/07/2012
Marco Caicedo, backed on 11/07/2012
Bruce Ramshaw, backed on 11/07/2012
Ann Becker-Schutte, backed on 11/07/2012
Michelle Litchman, backed on 11/07/2012
Shayan Shirazian, backed on 11/07/2012
Vivian Sun, backed on 11/07/2012
Irene Osborn, backed on 11/07/2012
Chireh-Yilob Andrew, backed on 11/07/2012
Barbara Waldorf, backed on 11/07/2012
Brian Ahier, backed on 11/07/2012
Andrea Fafford, backed on 11/07/2012
Sharon Donat, backed on 11/07/2012
Andre Blackman, backed on 11/07/2012
John Moehrke, backed on 11/07/2012
Sandra Lee, backed on 11/07/2012
Pam O, backed on 11/07/2012
Symplur LLC, backed on 11/07/2012
Lely Fernandez, backed on 11/07/2012
mark dale, backed on 11/06/2012
Suzanne McKeon, backed on 11/06/2012
Linda Capcara, backed on 11/06/2012
Mir Hajmiragha, backed on 11/06/2012
Lisa Shafer Amrine, backed on 11/06/2012
Ess Kay, backed on 11/06/2012
Reginald Edward, backed on 11/06/2012
Subidita Chatterjee, backed on 11/06/2012
Carolyn Burke, backed on 11/06/2012
Irene Healey, backed on 11/06/2012
Robert Speigel, backed on 11/06/2012
V Bell, backed on 11/06/2012
Julia Hallisy, backed on 11/06/2012
Andrew Spong, backed on 11/06/2012
Peter Levin, backed on 11/06/2012
Matthew Katz, backed on 11/06/2012
Fransiska Hadiwidjana, backed on 11/05/2012
diane stollenwerk, backed on 11/05/2012
Sharon Aubuchon, backed on 11/04/2012
David Ronce, backed on 11/04/2012
Anna Adeyemo, backed on 11/04/2012
Erin Gertz, backed on 11/04/2012
Eric Strausser, backed on 11/04/2012
Eugene, backed on 11/02/2012
George Larkin, backed on 11/02/2012
Iva Worthington, backed on 11/02/2012
Marjorie Grad, backed on 11/02/2012
Marc Paquin, backed on 11/02/2012
David Adelson, backed on 11/02/2012
Michael Schwartz, backed on 11/02/2012
S Turner Dean, backed on 11/01/2012
Sam Jones, backed on 11/01/2012
Debbie Rodgers, backed on 10/31/2012
Melanie Brown, backed on 10/31/2012
Babak, backed on 10/30/2012
Steven Deal, backed on 10/30/2012
Joe Fisher, backed on 10/30/2012
Matthew Fraser, backed on 10/30/2012
Samirah Majumdar, backed on 10/30/2012
Sandy Taylor, backed on 10/30/2012
john brohan, backed on 10/30/2012
hhadley, backed on 10/30/2012
Jeffrey Shepard, backed on 10/30/2012
Robin Streveler, backed on 10/30/2012
Andrew Wong, backed on 10/29/2012
Dave Matz, backed on 10/29/2012
Richard Caro, backed on 10/29/2012
laura armstrong, backed on 10/29/2012
Lauren Hopper, backed on 10/29/2012
Doug Hall, backed on 10/29/2012
Jennifer Ambrose, backed on 10/29/2012
Dale Ann Springer, backed on 10/28/2012
Andrew Eckerman, backed on 10/28/2012
Dan Taracks, backed on 10/28/2012
David M., backed on 10/26/2012
Alejandro, backed on 10/26/2012
Sam Visaisouk, backed on 10/26/2012
Steve Deal, backed on 10/26/2012
Satec Healthcare Solutions, backed on 10/24/2012
Dan Abrams, backed on 10/24/2012
Barry Peerless, backed on 10/24/2012
Brandon Tasset, backed on 10/24/2012
Julie Munro, backed on 10/24/2012
Carla Canlas, backed on 10/23/2012
heatherheine@gmail.com, backed on 10/23/2012
Innov8, backed on 10/23/2012
IFGHealth, backed on 10/23/2012
Minerva Health, backed on 10/23/2012
MedaCheck, backed on 10/23/2012
Empowering Innovations, backed on 10/23/2012
Tim Gutierrez, backed on 10/22/2012
Judy Fitzgerald, backed on 10/19/2012
Jamie Davis, backed on 10/19/2012
Linda Donofrio, backed on 10/19/2012
Kourtney Govro, backed on 10/19/2012
Rob Speigel, backed on 10/19/2012
Apurv Gupta, backed on 10/18/2012
elaheh, backed on 10/17/2012
Lyn Culbert, backed on 10/17/2012
Lisa Lehtinen, backed on 10/17/2012
Robert Goldberg, backed on 10/17/2012
Aneesh Kapur, backed on 10/17/2012
Mike Sevilla, backed on 10/17/2012
Ruth Ann Crystal, backed on 10/17/2012
Leticia Cano, backed on 10/16/2012
anonymous, backed on 10/16/2012
Janet Fisher, backed on 10/16/2012
Scott Strange, backed on 10/15/2012
Lily Neff, backed on 10/15/2012
Grenville Wilson, backed on 10/15/2012

Important Disclosure: MedStartr.com is a website owned and operated by MedStartr, Inc., which is not a broker-dealer, funding portal or investment advisor; and neither the website nor MedStartr, Inc. participate in the offer or sale of securities. All securities related activity is conducted through Young America Capital, LLC, a registered broker-dealer and member of FINRA/SIPC. No communication, through this website, email or in any other medium, should be construed as a recommendation for any securities offering.