## How to code Influence maximization problem in Python

Our decisions on buying some new product, following a trends are usually influenced by our friends and family. This forms the basis for incorporating social structures into consideration while planning a marketing policy. In this regard, marketing people highly interested to know which person hold the strongest influence in their society. So if somehow those people can be convinced for initial activation ( buying new product, following a trend or opinion) would trigger maximum influence spread in the society.

There are many models for this problem. We are going to use independent cascade model. According to this model, each node gets a single chance to influence each of its neighbors through their corresponding edges. Each edge is labelled with a probability known as propagation probability. This probability specify the chances of getting influenced of a node by its neighbors.

Influence maximization problem formally can be defined as finding k-node set which causes maximum information spread in the society. In this post, We will learn how to code influence maximization (independent cascade model) problem in python.

### Algorithm

- Select seed set based on some heuristic
- influenced_set=seed_set
- For each node in influenced_set repeat 4 to 6
- For all of its neighbors repeat 5 to 6
- generate a random number
- if this number is less than p add it to influenced_set
- print length(influenced_set)

Monte carlo simulations are used to find out influence of a node. To achieve the results close to accurate results, simulations have to be performed numerous time.

## Python code

For demonstration purpose we will use a random network and will consider same propagation probability between each edge pair.

First, we will generate a random graph using graph generators available in `networkx`

import networkx as nx import random G=nx.gnp_random_graph(100,.02) print nx.info(G) nx.draw(G) plt.show()

Output

Name: gnp_random_graph(100,0.02) Type: Graph Number of nodes: 100 Number of edges: 102 Average degree: 2.0400

""" Information cascade on independent cascade model Propagation probability p=.01 """ iterations=10000 # Number of Iterations p=.01 # Propagation probability seed_set=random.sample(G.nodes(),4) # Selecting intial seed set randomly print 'Selected Seeds:',seed_set avg_influence=0.0 for i in range(iterations): S=list(seed_set) for i in range(len(S)): # Process each node in seed set for neighbor in G.neighbors(S[i]): if random.random()<p: # Generate a random number and compare it with propagation probability if neighbor not in S: S.append(neighbor) avg_influence+=(float(len(S))/iterations) print 'Total influence:',int(round(avg_influence))