FindS Algorithm Implementation in Python
FindS algorithm tends to find out the most specific hypothesis which is consistent with given training data.
Let’s say we are given some training data X. Data instances in X are labelled as positive ( or yes, 1) and negative ( or no, 0).
Consistent Hypothesis
A hypothesis h(x) is said to be consistent with given training data
here, t(x) is the target value or the label.
Let’s understand above statement with the help of an example. We have following data
Example  Sky  Humidity  EnjoySport 

1  Sunny  Normal  Yes 
2  Rainy  High  No 
3  Sunny  High  Yes 
Along with this data we have two hypothesis which we would like to check whether they are consistent with training data or not. Our hypothesis are
here ? represent any value for Humidity attribute
Hypothesis H1
First instance is having values Sunny
and Normal
. Our hypothesis H1 classifies an instance as positive ( yes ) if instance has first value Sunny
( it accepts any value for second attribute). So H1 will classifies first instance as positive ( yes) which is given as positive (yes).
Every instance will be processed in similar manner. Below table showing results of it. Hypothesis H1 has classified each instance correctly. Hence, H1 is consistent with training data.
Example  Sky  Humidity  EnjoySport  H1 

1  Sunny  Normal  Yes  Yes 
2  Rainy  High  No  No 
3  Sunny  High  Yes  Yes 
Result : H1 is consistent.
Hypothesis H2
Hypothesis H2 classifies each instance as positive if that instance has values Sunny
and High
. If we look at the first instance, it has values Sunny and Normal. Hypotheis H2 will classify first instance as Negative or No whereas it is given as Yes. So this hypothesis is not consistent with given training data.
Example  Sky  Humidity  EnjoySport  H2 

1  Sunny  Normal  Yes  No 
2  Rainy  High  No  No 
3  Sunny  High  Yes  Yes 
Result : H2 is not consistent.
FindS Algorithm Implementation
Training Data
Example  Sky  AirTemp  Humidity  Wind  Water  Forecast  EnjoySport 

1  Sunny  Warm  Normal  Strong  Warm  Same  Yes 
2  Sunny  Warm  High  Strong  Warm  Same  Yes 
3  Rainy  Cold  High  Strong  Warm  Change  No 
4  Sunny  Warm  High  Strong  Cool  Change  Yes 
Python Implementation
# Implementation FindS # Step1 : Initiliaze h with most specific hypothesis in H # h will reject every instance h=['phi','phi','phi','phi','phi','phi'] # 'phi' indicate no value is accepted. 'any' indicates every value is accepted. # Training Data Data=[['Sunny','Warm','Normal','Strong','Warm','Same','Yes'], ['Sunny','Warm','High','Strong','Warm','Same','Yes'], ['Sunny','Warm','Normal','Strong','Warm','Same','No'], ['Sunny','Warm','High','Strong','Cool','Change','Yes'] ] # Step2 : Iterate over training data and replace constraints in hypothesis with more general constraints ''' Function: isConsistent(hypothesis,data) This function check whether given hypothesis is consitent with given data instance. It compare each attribute's value in hypothesis with respective attribute's value in data. ''' def isConsistent(h,d): # Check number of attribute is hypothesis is one less than number of attribute in data. # Since one attribute in data is class attribute which is not considered in hypothesis. if len(h)!=len(d)1: print('Number of attributes are not same in hypothesis.') return False else: # variable 'matched keeps number of attributes which are consistent in hypothesis. matched=0 # Iterate over each attribute in hypothesis for i in range(len(h)): # Check if attribute in hypothesis is equal to repsective attribute's value in data instance or # it has 'any' value. if ( (h[i]==d[i])  (h[i]=='any') ): # if condition is satisfied then increase matched matched=matched+1 # Return true if for all attribute's value in data, hypothesis is consistent. if matched==len(h): return True else: return False ''' Function: makeConsistent(hypothesis,data) This function change hypothesis to make it consistent with given data instance. ''' def makeConsistent(h,d): # Iterate over each attribute in hypothesis for i in range(len(h)): # if ith attribute in hypothesis reject each value. 'phi' indicates that each value is rejected. if((h[i] == 'phi')): # Replace ith value in hypothesis with data instance's ith attribute value. h[i]=d[i] # if hypothesis ith value is not 'phi' and it is also not equal to ith value in data instance. elif(h[i]!=d[i]): # Replace ith value in hypothesis with 'any'. 'any' accept each value for that attribute. h[i]='any' # Return updated hypothesis return h print('Begin : Hypothesis :',h) print('==========================================') # Iterate over each data instance in given training data for d in Data: # Consider only positive instance ( instance with 'Yes' class) if d[len(d)1]=='Yes': # Check whether hypothesis is consistent with current data instance if ( isConsistent(h,d)): # Print hypothesis print ("Hypothesis :",d) else: # If hypothesis is not consistent then make it consistent with current data instance h=makeConsistent(h,d) # Print current data instance and updated hypothesis print ('Training data :',d) print ('Updated Hypothesis :',h) print() print('') print('==========================================') print('End: Hypothesis :',h)
Output :
# Begin : Hypothesis : ['phi', 'phi', 'phi', 'phi', 'phi', 'phi'] Training data : ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same', 'Yes'] Updated Hypothesis : ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'] * * * Training data : ['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same', 'Yes'] Updated Hypothesis : ['Sunny', 'Warm', 'any', 'Strong', 'Warm', 'Same'] * * * Training data : ['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change', 'Yes'] Updated Hypothesis : ['Sunny', 'Warm', 'any', 'Strong', 'any', 'any']  End: Hypothesis : ['Sunny', 'Warm', 'any', 'Strong', 'any', 'any']

Divya