Hi,
So today I played a little bit with the possibility of storing the unique IP addresses in a separate table.
I will be using a subscription from ip-api.com, which seems to offer batch queries with a limit of 100 IPs per payload.
So, at first glance there are 227200 unique IPs in my dataset. That works out to 2272 payloads to be queried.
The code looks more or less like this:
# Distinct source IPs from the dataset
unique_ip = temp['SourceIP'].unique()
# Split them into chunks of 100, the batch limit per payload
unique_list = [unique_ip[i:i + 100] for i in range(0, len(unique_ip), 100)]
data = []
for i in range(len(unique_list)):
    temp_dict = {}
    temp_dict['id'] = i + 1  # 1-based payload number
    temp_dict['payload'] = unique_list[i].tolist()
    data.append(temp_dict)
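For anyone who wants to try the chunking without the dataset, here is a minimal sketch of the same idea using plain Python lists. The function name build_payloads and the sample addresses are made up for illustration; the 100-per-payload limit is the one mentioned above.

```python
def build_payloads(ips, batch_size=100):
    """Deduplicate ips (keeping first-seen order) and split them into numbered batches."""
    unique_ips = list(dict.fromkeys(ips))
    return [
        {"id": n, "payload": unique_ips[start:start + batch_size]}
        for n, start in enumerate(range(0, len(unique_ips), batch_size), start=1)
    ]


# Made-up sample: 250 distinct addresses plus one duplicate
sample = [f"10.0.{i // 256}.{i % 256}" for i in range(250)] + ["10.0.0.0"]
payloads = build_payloads(sample)
print(len(payloads), [len(p["payload"]) for p in payloads])
```

The last batch is simply shorter when the count is not a multiple of 100, so no padding is needed.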
Once this is constructed, you only need to go through the list element by element and insert each one into MongoDB using this code:
import pymongo
# Connect to the local MongoDB instance
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["unique_ip"]
# One document per payload
for i in range(len(data)):
    mycol.insert_one(data[i])
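A possible alternative to the loop, as a sketch only: pymongo collections also have insert_many, which writes the whole list in one call. The helper name store_payloads is hypothetical, and I have not timed it against the loop on this dataset.

```python
def store_payloads(collection, data):
    """Insert all payload documents in one call and return how many were written."""
    if not data:
        return 0  # insert_many raises on an empty list, so skip the call
    result = collection.insert_many(data)
    return len(result.inserted_ids)


# Usage with the collection from the code above (needs a running MongoDB):
# store_payloads(mycol, data)
```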
The next step will involve taking the documents from the collection one by one and sending each payload to the API endpoint.
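A rough sketch of how that step could look, using only the standard library. The assumptions here are mine: that the batch endpoint is http://ip-api.com/batch and takes a POST with a JSON array of IPs (the subscription URL and key parameter should be checked against the ip-api.com docs), and the names post_json, resolve_payloads and pause are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: batch endpoint accepting a JSON array of IPs via POST.
# A subscription will likely need a different host and a key parameter.
BATCH_URL = "http://ip-api.com/batch"


def post_json(url, payload):
    """POST payload as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def resolve_payloads(docs, send=post_json, url=BATCH_URL, pause=0.0):
    """Send each stored payload to the API and collect the responses by payload id."""
    results = {}
    for doc in docs:
        results[doc["id"]] = send(url, doc["payload"])
        time.sleep(pause)  # optional delay between requests
    return results


# Usage with the collection from the code above (needs MongoDB and network access):
# results = resolve_payloads(mycol.find(), pause=1.0)
```

Passing the sender in as a parameter makes it easy to dry-run the loop with a fake function before spending any API quota.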
Thanks,
Sorin