How can I fix this error? ValueError: Found input variables with inconsistent numbers of samples: [143, 426]
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    # Split the data set into independent (X) and dependent (Y) data sets
    X = df.iloc[:, 2:31].values
    Y = df.iloc[:, 1].values

    # Split the data set into 75% training and 25% testing
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.25, random_state = 0)

    # Scale the data (feature scaling)
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_train = sc.fit_transform(X_test)

    # Fit a Logistic Regression model to the training set
    classifier = LogisticRegression(random_state = 0)
    classifier.fit(X_train, Y_train)
These are the shapes of X_train and Y_train:
    X_train.shape
    (143, 29)
    Y_train.shape
    (426,)
Error message:

    ValueError                                Traceback (most recent call last)
     in ()
          2
          3 classifier = LogisticRegression(random_state = 0)
    ----> 4 classifier.fit(X_train, Y_train)
          5 #Using KNeighborsClassifier Method of neighbors class to use Nearest Neighbor algorithm
          6

    2 frames
    /usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
        210     if len(uniques) > 1:
        211         raise ValueError("Found input variables with inconsistent numbers of"
    --> 212                          " samples: %r" % [int(l) for l in lengths])
        213
        214

    ValueError: Found input variables with inconsistent numbers of samples: [143, 426]
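The shapes reported above point to a likely cause: 143 rows is the size of the 25% test split and 426 is the size of the 75% training split, so X_train appears to have been overwritten with the scaled X_test in the second scaling line. Below is a minimal sketch of the scaling step with that assignment changed. It assumes a synthetic stand-in for df built with make_classification (569 rows and 29 features, inferred from 143 + 426 and the column count), since the original data is not shown. It also uses transform rather than fit_transform on the test set, so the test data is scaled with the training statistics.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for df, sized to match the shapes in the question
X, Y = make_classification(n_samples=569, n_features=29, random_state=0)

# Split the data set into 75% training and 25% testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25, random_state=0)

# Fit the scaler on the training rows only, then reuse it for the test rows
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)  # assigned to X_test, not X_train

# Both pairs should now agree on the number of samples
print(X_train.shape, Y_train.shape)
print(X_test.shape, Y_test.shape)

classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, Y_train)
```

If the shapes printed before fit still disagree, the mismatch comes from somewhere other than the scaling step, for example a notebook cell that was re-run out of order.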