Social network graph

Social Network Graph

1 Social Networks: friends recommendations and more


Have you ever wondered how social networks, such as Facebook, recommend friends to you? Most of the social networks usehighly sophisticated algorithms for this, but for this assignment you will implement a fairly naive algorithm to recommend themost likely new friend to users of a social network. In particular, you will recommend the most probable user to befriend based
upon the intersection of your common friends. In other words, the user that you will suggest to Person A is the person whohas the most friends in common with Person A, but who currently is not friends with Person A.
Five text files have been provided for you to run your program with. Each represents a social network. Three are smalltest files containing a made-up set of users and their friendships (these files are net1.txt, net2.txt and net3.txt). The two are asubset of a real Facebook dataset, which was obtained from: fhttps://snap.stanford.edu/data/egonets-Facebook.html
The format of all five files is the same:
The first line of the file is an integer representing the number of users in the given network.
The following lines are of the form: user_uuser_vwhere user_uand user_vare the (non-negative integer) IDs of two userswho are friends.
In addition user_uis always less than user_v
For example, here is a very small file that has 5 users in the social network: 5
0 1
1 2
1 8
2 3
The above is a representation of a social network that contains 5 users.
User ID=0 is friends with User IDs = 1
User ID=1 is friends with User IDs = 0, 2, 8
User ID=2 is friends with User IDs = 1, 3
User ID=3 is friends with User IDs = 2
User ID=8 is friends with User IDs = 1
Spend time studying the above small example to understand the model. For example, notice that since friendship is asymmetric relationship the social media networks in this assignment, if user_uis friends with user_v, that means that user_vis also friends with user_u. Such “duplicate” friendships are not present in the file. In particular each friendship is listed once
in such way that user_u<user_v
Also note that, while you can assume that user IDs are sorted, you cannot assume that they are consecutive integersdiffering by one. For example the user IDs above are: 0,1,2,3,8.
You can also assume that in each file the users are sorted from smallest to largest (in the above example you see that usersappear as: 0 1 1 2). Specifically, friendships of user_uappear before friendships of user_vif and only if user_u<user_v.
And also for each user its friends appear sorted, for example for user 1 friendship with friend 2 appears before friendship withfriend 4.
To complete the assignment you will have to code the following 9 functions. I strongly recommend you code the in the ordergiven below and do not move onto coding a function until you complete all before. The function descriptions, including whatthey need to do, are given in a4_xxxxxx.py.

  1. create_network(file_name):This is the most important (and possibly the most difficult) function to solve.The function needs to read a file and return a list of tuples representing the social network from the file. In particular thefunction returns a list of tuples where each tuple has 2 elements: the first is an integer representing an ID of a user andthe second is the list of integers representing his/her friends. In the a4_xxxxxx.py I refer the list that create_networkfunction returns as a 2D-list for friendship network (although one can argue that is is a 3D list). In addition the2D-list for friendship network that must create_networkfunction returns must be sorted by the ID and a listof friends in each tuple also must be sorted.
    So for the example above, this function should return the following 2D-list for 2D-list for friendship network:
    [(0, [1]), (1, [0,2,4]), (2,[1,3]), (3,[2]), (8,[1])]
    More examples:
    >>> net1=create_network(“net1.txt”)
    >>> net1
    [(0, [1, 2, 3]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6, 8, 9]), (3, [0, 2, 8, 9]), (4, [1, 6, 7, 8]),(5, [9]), (6, [1, 2, 4, 8]), (7, [1, 4, 8]), (8, [2, 3, 4, 6, 7]), (9, [1, 2, 3, 5])]

>>> net2=create_network(“net2.txt”)
>>> net2
[(0, [1, 2, 3, 4, 5, 6, 7, 8, 9]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6,8, 9]), (3, [0, 2, 8, 9]), (4, [0, 1, 6, 7, 8]),
(5, [0, 9]), (6, [0, 1, 2, 4, 8]), (7, [0, 1, 4, 8]), (8, [0, 2, 3, 4, 6, 7]), (9, [0, 1, 2, 3, 5])]
>>> net3=create_network(“net3.txt”)
>>>
[(0, [1, 2, 3, 4, 5, 6, 7, 8, 9]), (1, [0, 4, 6, 7, 9]), (2, [0, 3, 6,8, 9]), (3, [0, 2, 8, 9]), (4, [0, 1, 6, 7, 8]),
(5, [0, 9]), (6, [0, 1, 2, 4, 8]), (7, [0, 1, 4, 8]), (8, [0, 2, 3, 4, 6, 7]), (9, [0, 1, 2, 3, 5]),(100, [112]), (112, [100, 114]), (114, [112])]
>>> net4=create_network(“big.txt”)
>>> net4[500:502]
[(500, [348, 353, 354, 355, 361, 363, 368, 373, 374, 376, 378, 382, 388, 391, 392, 396, 400, 402

  1. getCommonFriends(user1, user2, network)
    >>>getCommonFriends(3,1,net1)
    [0, 9]
    >>>getCommonFriends(0,112,net3)
    []
    >>>getCommonFriends(217,163,net4)
    [0, 100, 119, 150]
  2. recommend(user, network)
    Read the docstrings to understand how this function should work. Understand why the given friends are recommended
    in the examples below including why no friend is recommended for 0 in net2 and 112 in net 3.
    >>> recommend(6,net1)
    7
    >>> recommend(4,net2)
    2
    >>> recommend(0,net2)
    >>> recommend(114, net3)
    100
    >>> recommend(112,net3)
    >>> recommend(217,net4)
    163
  3. k_or_more_friends(network, k)
    >>>k_or_more_friends(net1, 5)
    3
    >>>k_or_more_friends(net2, 8)
    1
    >>>k_or_more_friends(net3, 12)
    0
    >>>k_or_more_friends(net4, 70)
    33
    5. maximum_num_friends(network) (5 points)
    >>>maximum_num_friends(net1)
    5
    >>>maximum_num_friends(net2)
    9
    >>>maximum_num_friends(net3)
    9
    >>>maximum_num_friends(net4)
    347
  4. people_with_most_friends(network)

>>>people_with_most_friends(net1)
[1, 2, 8]
>>>people_with_most_friends(net2)
[0]
>>>people_with_most_friends(net3)
[0]
>>>people_with_most_friends(net4)
[0]

  1. average_num_friends(network)
    >>>average_num_friends(net1)
    3.8
    >>>average_num_friends(net2)
    5.0
    >>>average_num_friends(net3)
    4.153846153846154
    >>>average_num_friends(net4)
    19.7
  2. knows_everyone(network)
    >>>knows_everyone(net1)
    False
    >>>knows_everyone(net2)
    True
    >>>knows_everyone(net3)
    False
    >>>knows_everyone(net4)
    False
  3. get_uid()Here is how this function should run if net1.txt was enter when “Run Module” is pressed.
    >>>get_uid()
    Enter an integer for a user ID:alsj
    That was not an integer. Please try again.
    Enter an integer for a user ID: twenty
    That was not an integer. Please try again.
    Enter an integer for a user ID:9aslj
    That was not an integer. Please try again.
    Enter an integer for a user ID:100000
    That was not an integer. Please try again.
    Enter an integer for a user ID: 4.5
    That was not an integer. Please try again.
    Enter an integer for a user ID: -10
    That user ID does not exist. Try again.
    Enter an integer for a user ID: 7
    7

1.1 Bonus
This assignment offers up to 10% bonus. The bonus is available for the maximum of 30-40 students. In order for a student tohave a chance to get the bonus the following needs to happen:
1. She needs to correctly solve/code create_network and getCommonFriends functions. She will know that that is the caseby waiting for the grading to be completed and seeing that she did not loose any points on these 2 functions; and
2. She needs to solve getCommonFriends(user1, user2, network) with running time
O(n1 + n2 + log n) were n is the the total number of users in the network, n1 is the number of friends of user1 and n2 isthe number of friends of user2. In other words, n1 =len(user1), n2 =len(user2) and n =len(network).
Note that on a typical network O(n1+n2+log n) is much better than O(n) since a network like Facebook has n roughly 2billion and the average number of friends per user is 338. Thus the number of operations an O(n) solution would do, would be in the order of a billion, roughly. While the number of operations an O(num_friends_user1 + num_friends_user2 + log n)
solution would do, would be in the order of, O(338 + 338 + 21), thousand, roughly
Thus O(n) solutions will not be accepted for the bonus.
To determine the running times of Python’s functions on lists you can use this link (although it is not quite correct as itis amortized, which they incorrectly call average, analysis and not the worst case analysis).
https://wiki.python.org/moin/TimeComplexity
Again you cannot use sets nor dictionaries nor deque nor bisect module.
3. Finally, to obtain the bonus the student needs to come to my office hours between November 27th (or in extra officehours I will give that week) in order to explain her faster solution.

import random

defcreate_network(file_name):

”'(str)->list of tuples where each tuple has 2 elements the first is int and the second is list of int

Precondition: file_name has data on social netowrk. In particular:

The first line in the file contains the number of users in the social network

Each line that follows has two numbers. The first is a user ID (int) in the social network,

the second is the ID of his/her friend.

The friendship is only listed once with the user ID always being smaller than friend ID.

For example, if 7 and 50 are friends there is a line in the file with 7 50 entry, but there is line 50 7.

There is no user without a friend

Users sorted by ID, friends of each user are sorted by ID

Returns the 2D list representing the frendshipnework as described above

where the network is sorted by the ID and each list of int (in a tuple) is sorted (i.e. each list of friens is sorted).

”’

friends = open(file_name).read().splitlines()

network=[]

# YOUR CODE GOES HERE

return network

defgetCommonFriends(user1, user2, network):

”'(int, int, 2D list) ->int

Precondition: user1 and user2 IDs in the network. 2D list sorted by the IDs,

and friends of user 1 and user 2 sorted

Given a 2D-list for friendship network, returns the sorted list of common friends of user1 and user2

”’

common=[]

# YOUR CODE GOES HERE

return common

def recommend(user, network):

”'(int, 2Dlist)->int or None

Given a 2D-list for friendship network, returns None if there is no other person

who has at least one neighbour in common with the given user and who the user does

not know already.

Otherwise it returns the ID of the recommended friend. A recommended friend is a person

you are not already friends with and with whom you have the most friends in common in the whole network.

If there is more than one person with whom you have the maximum number of friends in common

return the one with the smallest ID. ”’

# YOUR CODE GOES HERE

pass

defk_or_more_friends(network, k):

”'(2Dlist,int)->int

Given a 2D-list for friendship network and non-negative integer k,

returns the number of users who have at least k friends in the network

Precondition: k is non-negative”’

# YOUR CODE GOES HERE

pass

defmaximum_num_friends(network):

”'(2Dlist)->int

Given a 2D-list for friendship network,

returns the maximum number of friends any user in the network has.

”’

# YOUR CODE GOES HERE

pass

defpeople_with_most_friends(network):

”'(2Dlist)->1D list

Given a 2D-list for friendship network, returns a list of people (IDs) who have the most friends in network.”’

max_friends=[]

# YOUR CODE GOES HERE

returnmax_friends

defaverage_num_friends(network):

”'(2Dlist)->number

Returns an average number of friends overs all users in the network”’

# YOUR CODE GOES HERE

pass

defknows_everyone(network):

”'(2Dlist)->bool

Given a 2D-list for friendship network,

returns True if there is a user in the network who knows everyone

and False otherwise”’

# YOUR CODE GOES HERE

pass

####### CHATTING WITH USER CODE:

defis_valid_file_name():

”’None->str or None”’

file_name = None

try:

file_name=input(“Enter the name of the file: “).strip()

f=open(file_name)

f.close()

exceptFileNotFoundError:

print(“There is no file with that name. Try again.”)

file_name=None

returnfile_name

defget_file_name():

”'()->str

Keeps on asking for a file name that exists in the current folder,

until it succeeds in getting a valid file name.

Once it succeeds, it returns a string containing that file name”’

file_name=None

whilefile_name==None:

file_name=is_valid_file_name()

returnfile_name

defget_uid():

”'()->int

Keeps on asking for a user ID that exists in the network

until it succeeds. Then it returns it”’

# YOUR CODE GOES HERE

pass

##############################

# main

##############################

# NOTHING FOLLOWING THIS LINE CAN BE REMOVED or MODIFIED

file_name=get_file_name()

net=create_network(file_name)

print(“\nFirst general statistics about the social network:\n”)

print(“This social network has”, len(net), “people/users.”)

print(“In this social network the maximum number of friends that any one person has is “+str(maximum_num_friends(net))+”.”)

print(“The average number of friends is “+str(average_num_friends(net))+”.”)

mf=people_with_most_friends(net)

print(“There are”, len(mf), “people with “+str(maximum_num_friends(net))+” friends and here are their IDs:”, end=” “)

for item in mf:

print(item, end=” “)

print(“\n\nI now pick a number at random.”, end=” “)

k=random.randint(0,len(net)//4)

print(“\nThat number is: “+str(k)+”. Let’s see how many people has that many friends.”)

print(“There is”, k_or_more_friends(net,k), “people with”, k, “or more friends”)

ifknows_everyone(net):

print(“\nThere at least one person that knows everyone.”)

else:

print(“\nThere is nobody that knows everyone.”)

print(“\nWe are now ready to recommend a friend for a user you specify.”)

uid=get_uid()

rec=recommend(uid, net)

if rec==None:

print(“We have nobody to recommend for user with ID”, uid, “since he/she is dominating in their connected component”)

else:

print(“For user with ID”, uid,”we recommed the user with ID”,rec)

print(“That is because users”, uid, “and”,rec, “have”, len(getCommonFriends(uid,rec,net)), “common friends and”)

print(“user”, uid, “does not have more common friends with anyone else.”)

print(“\nFinaly, you showed interest in knowing common friends of some pairs of users.”)

print(“About 1st user …”)

uid1=get_uid()

print(“About 2st user …”)

uid2=get_uid()

print(“Here is the list of common friends of”, uid1, “and”, uid2)

common=getCommonFriends(uid1,uid2,net)

for item in common:

print(item, end=” “) 

Solution 

import random

defbinary_search(user, network):

”'(int, 2D list) ->list

Precondition: 2D list sorted by the IDs

Returns a list of user’s friends if user ID is

present in the network and None if user is not

found.

”’

i, j = 0, len(network) – 1

k = (i + j) // 2

while i <= j:

if network[k][0] < user:

i = k + 1

elif network[k][0] > user:

j = k – 1

else: # user found

return network[k][1]

k = (i + j) // 2

# user not found

return None

defcreate_network(file_name):

”'(str)->list of tuples where each tuple has 2 elements the first is int and the second is list of int

Precondition: file_name has data on social netowrk. In particular:

The first line in the file contains the number of users in the social network

Each line that follows has two numbers. The first is a user ID (int) in the social network,

the second is the ID of his/her friend.

The friendship is only listed once with the user ID always being smaller than friend ID.

For example, if 7 and 50 are friends there is a line in the file with 7 50 entry, but there is line 50 7.

There is no user without a friend

Users sorted by ID, friends of each user are sorted by ID

Returns the 2D list representing the frendshipnework as described above

where the network is sorted by the ID and each list of int (in a tuple) is sorted (i.e. each list of friens is sorted).

”’

friends = open(file_name).read().splitlines()

network=[]

# YOUR CODE GOES HERE

all_users = []

for pair in friends[1:]:     # skip the number of users

user_u, user_v = pair.split()

# convertstr to int

user_u, user_v = int(user_u), int(user_v)

# add both pairs

all_users.append((user_u, user_v))

all_users.append((user_v, user_u))

foruser_u, user_v in sorted(all_users):

if network == [] or network[-1][0] != user_u:

network.append((user_u, [user_v]))

else:   # append to the last user, since all_users is sorted

network[-1][1].append(user_v)

return network

defgetCommonFriends(user1, user2, network):

”'(int, int, 2D list) ->int

Precondition: user1 and user2 IDs in the network. 2D list sorted by the IDs,

and friends of user 1 and user 2 sorted

Given a 2D-list for friendship network, returns the sorted list of common friends of user1 and user2

”’

common=[]

# YOUR CODE GOES HERE

# O(log n)

friends1 = binary_search(user1, network)

friends2 = binary_search(user2, network)

# O(n1 + n2)

for friend in friends1:

if friend in friends2:

common.append(friend)

# total complexity: O(n1 + n2 + log n)

return common

def recommend(user, network):

”'(int, 2Dlist)->int or None

Given a 2D-list for friendship network, returns None if there is no other person

who has at least one neighbour in common with the given user and who the user does

not know already.

Otherwise it returns the ID of the recommended friend. A recommended friend is a person

you are not already friends with and with whom you have the most friends in common in the whole network.

If there is more than one person with whom you have the maximum number of friends in common

return the one with the smallest ID. ”’

# YOUR CODE GOES HERE

# locate user’s friends

friends = binary_search(user, network)

recommended = None

common_with_recommended = 0

for u, frnd in network:

# do not count itself or own friends

if u != user and u not in friends:

common = getCommonFriends(user, u, network)

# have more common friends with u, so u is now recommented

iflen(common) >common_with_recommended:

common_with_recommended = len(common)

recommended = u

return recommended

defk_or_more_friends(network, k):

”'(2Dlist,int)->int

Given a 2D-list for friendship network and non-negative integer k,

returns the number of users who have at least k friends in the network

Precondition: k is non-negative”’

# YOUR CODE GOES HERE

returnlen([user for user, friends in network if len(friends) >= k])

defmaximum_num_friends(network):

”'(2Dlist)->int

Given a 2D-list for friendship network,

returns the maximum number of friends any user in the network has.

”’

# YOUR CODE GOES HERE

return max(len(friends) for user, friends in network)

defpeople_with_most_friends(network):

”'(2Dlist)->1D list

Given a 2D-list for friendship network, returns a list of people (IDs) who have the most friends in network.”’

max_friends=[]

# YOUR CODE GOES HERE

max_num_friends = maximum_num_friends(network)

for user, friends in network:

iflen(friends) == max_num_friends:

max_friends.append(user)

returnmax_friends

defaverage_num_friends(network):

”'(2Dlist)->number

Returns an average number of friends overs all users in the network”’

# YOUR CODE GOES HERE

return sum(len(friends) for user, friends in network) / len(network)

defknows_everyone(network):

”'(2Dlist)->bool

Given a 2D-list for friendship network,

returns True if there is a user in the network who knows everyone

and False otherwise”’

# YOUR CODE GOES HERE

for user, friends in network:

# do not count itself

iflen(friends) == len(network) – 1:

return True

return False

####### CHATTING WITH USER CODE:

defis_valid_file_name():

”’None->str or None”’

file_name = None

try:

file_name=input(“Enter the name of the file: “).strip()

f=open(file_name)

f.close()

exceptFileNotFoundError:

print(“There is no file with that name. Try again.”)

file_name=None

returnfile_name

defget_file_name():

”'()->str

Keeps on asking for a file name that exists in the current folder,

until it succeeds in getting a valid file name.

Once it succeeds, it returns a string containing that file name”’

file_name=None

whilefile_name==None:

file_name=is_valid_file_name()

returnfile_name

defget_uid(network):

”'(2Dlist)->int

Keeps on asking for a user ID that exists in the network

until it succeeds. Then it returns it”’

# YOUR CODE GOES HERE

have_uid = False

while not have_uid:

uid = input(‘Enter an integer for a user ID:’)

try:

uid = int(uid)

# globalvar is used — don’t see any other way

ifbinary_search(uid, network):

have_uid = True

else:

print(‘That user ID does not exist. Try again.’)

exceptValueError:

print(‘That was not an integer. Please try again.’)

returnuid

##############################

# main

##############################

# NOTHING FOLLOWING THIS LINE CAN BE REMOVED or MODIFIED

file_name=get_file_name()

net=create_network(file_name)

print(“\nFirst general statistics about the social network:\n”)

print(“This social network has”, len(net), “people/users.”)

print(“In this social network the maximum number of friends that any one person has is “+str(maximum_num_friends(net))+”.”)

print(“The average number of friends is “+str(average_num_friends(net))+”.”)

mf=people_with_most_friends(net)

print(“There are”, len(mf), “people with “+str(maximum_num_friends(net))+” friends and here are their IDs:”, end=” “)

for item in mf:

print(item, end=” “)

print(“\n\nI now pick a number at random.”, end=” “)

k=random.randint(0,len(net)//4)

print(“\nThat number is: “+str(k)+”. Let’s see how many people has that many friends.”)

print(“There is”, k_or_more_friends(net,k), “people with”, k, “or more friends”)

ifknows_everyone(net):

print(“\nThere at least one person that knows everyone.”)

else:

print(“\nThere is nobody that knows everyone.”)

print(“\nWe are now ready to recommend a friend for a user you specify.”)

uid=get_uid(net)

rec=recommend(uid, net)

if rec==None:

print(“We have nobody to recommend for user with ID”, uid, “since he/she is dominating in their connected component”)

else:

print(“For user with ID”, uid,”we recommend the user with ID”,rec)

print(“That is because users”, uid, “and”,rec, “have”, len(getCommonFriends(uid,rec,net)), “common friends and”)

print(“user”, uid, “does not have more common friends with anyone else.”)

print(“\nFinally, you showed interest in knowing common friends of some pairs of users.”)

print(“About 1st user …”)

uid1=get_uid(net)

print(“About 2st user …”)

uid2=get_uid(net)

print(“Here is the list of common friends of”, uid1, “and”, uid2)

common=getCommonFriends(uid1,uid2,net)

for item in common:

print(item, end=” “)