Wednesday, August 2, 2017

python - Inserting 1 billion rows in MySQL fast




I am currently trying to insert roughly a billion rows of data into a MySQL table. I am pulling the data from a directory of .JSON files, where each .JSON file contains ~200K rows; there are 5K files in total.
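For reference, the field accesses in the code below (j["rows"] and i["values"][0][0..2]) suggest that each file decodes to something roughly like the following; the exact keys and nesting are an inference from that code, not something the post spells out:

# Shape inferred from the parsing code below; purely illustrative.
example_file = {
    "rows": [
        {"values": [["Alice", 34, "F"]]},
        {"values": [["Bob", 29, "M"]]},
    ]
}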



What I am currently doing is going through each file and building a tuple for each row I want to insert. I append these tuples to a list, and once I have worked through the whole JSON file, I insert the list of rows into MySQL. This is faster than inserting one row at a time, but it is still going to take me over 3 days, and I don't have the time to spare.



I initially created lists of 200,000,000 rows each (which were fast to generate), but they took too long to insert into MySQL. That is why I am now only inserting 200,000 rows at a time (a rough sketch of this chunked approach follows the code below). Does anyone have any advice on how to speed this up?



import glob
import json
import os

# conn is assumed to be an existing MySQL connection (e.g. from
# mysql.connector or PyMySQL); it is not created in this snippet.

path = *path to my file*
for filename in glob.glob(os.path.join(path, '*.JSON')):
    myList = []
    with open(filename) as json_data:
        j = json.load(json_data)
        for i in j["rows"]:
            name = i["values"][0][0]
            age = i["values"][0][1]
            gender = i["values"][0][2]
            # None is the auto-increment id column, filled in by MySQL
            data = (None, name, age, gender)
            myList.append(data)
    # one bulk insert per file (~200K rows), then commit
    cursor = conn.cursor()
    q = """INSERT INTO nordic_data values (%s,%s,%s,%s)"""
    cursor.executemany(q, myList)
    conn.commit()
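For what it's worth, a minimal sketch of the chunking described above, splitting an accumulated list of rows into fixed-size batches of 200,000 before each executemany call, might look like this; the chunk size, the conn object, and the nordic_data table layout are taken from the question, and everything else (the helper name and its exact shape) is an assumption:

def insert_in_chunks(conn, rows, chunk_size=200_000):
    """Insert (id, name, age, gender) tuples in fixed-size batches."""
    q = """INSERT INTO nordic_data values (%s,%s,%s,%s)"""
    cursor = conn.cursor()
    for start in range(0, len(rows), chunk_size):
        batch = rows[start:start + chunk_size]
        cursor.executemany(q, batch)
        conn.commit()  # one commit per batch, not per row

With per-file lists of roughly 200K rows this behaves the same as the loop above; the chunking only matters when a single list grows much larger than that.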




Source link