Stock Price Prediction with LSTM

Hands-On Project: Stock Price Prediction with LSTM Neural Networks

Difficulty Level: Intermediate | Duration: 4-6 hours | Topics: Deep Learning, Time Series, Neural Networks

Learning Objectives

Understand time series data and its characteristics
Learn how LSTM (Long Short-Term Memory) networks work
Build and train a neural network for stock prediction
Evaluate model performance and make predictions
Deploy the model for real-world use

Prerequisites

Python 3.8 or higher
Basic understanding of neural networks
Familiarity with Pandas and NumPy
Jupyter Notebook or Google Colab

Step 1: Setup & Installation

1.1 Create a Python Virtual Environment


# Create virtual environment
python -m venv stock-prediction
source stock-prediction/bin/activate  # On Windows: stock-prediction\Scripts\activate

# Install required packages
pip install numpy pandas scikit-learn tensorflow keras matplotlib yfinance

1.2 Import Libraries


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.models import Sequential
import yfinance as yf
import warnings
warnings.filterwarnings('ignore')

# Display settings
plt.style.use('seaborn-v0_8-darkgrid')
pd.set_option('display.max_columns', None)

Step 2: Data Collection & Preparation

2.1 Download Stock Data


# Download historical stock data (e.g., Apple stock)
ticker = "AAPL"
start_date = "2020-01-01"
end_date = "2024-11-19"

# Fetch data from Yahoo Finance
data = yf.download(ticker, start=start_date, end=end_date)
print(f"Data shape: {data.shape}")
print(f"\nFirst few rows:\n{data.head()}")
print("\nData info:")
data.info()  # info() prints its report directly and returns None

2.2 Prepare Features & Scale Data


# Use only closing price
close_price = data['Close'].values.reshape(-1, 1)

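# Normalize the data to range [0, 1]
# Fit the scaler on the first 80% (the training portion) only, so that
# test-period prices do not leak into the scaling; later values may exceed 1
train_size = int(len(close_price) * 0.8)
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(close_price[:train_size])
scaled_data = scaler.transform(close_price)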

# Visualize original vs scaled data
plt.figure(figsize=(14, 5))
plt.subplot(1, 2, 1)
plt.plot(close_price)
plt.title('Original Stock Price')
plt.xlabel('Days')
plt.ylabel('Price ($)')

plt.subplot(1, 2, 2)
plt.plot(scaled_data)
plt.title('Normalized Stock Price')
plt.xlabel('Days')
plt.ylabel('Normalized Value')
plt.tight_layout()
plt.show()

2.3 Create Training & Testing Sets


# Split data: 80% train, 20% test
train_size = int(len(scaled_data) * 0.8)
train_data = scaled_data[:train_size]
test_data = scaled_data[train_size:]

print(f"Train size: {train_size}, Test size: {len(test_data)}")

# Create sequences for LSTM
def create_sequences(data, lookback=60):
    X, y = [], []
    for i in range(lookback, len(data)):
        X.append(data[i-lookback:i, 0])
        y.append(data[i, 0])
    return np.array(X), np.array(y)

lookback = 60  # Use last 60 days to predict next day
X_train, y_train = create_sequences(train_data, lookback)
X_test, y_test = create_sequences(test_data, lookback)

print(f"X_train shape: {X_train.shape}, y_train shape: {y_train.shape}")
print(f"X_test shape: {X_test.shape}, y_test shape: {y_test.shape}")

# Reshape for LSTM [samples, timesteps, features]
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
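
To make the windowing concrete, here is a minimal sketch that applies the same sliding-window logic to a toy series of ten values with a lookback of 3. The toy data, the lookback of 3, and the helper name `make_windows` are illustrative assumptions; the helper simply mirrors `create_sequences` above.

```python
import numpy as np

def make_windows(series, lookback):
    """Return (X, y): each row of X holds `lookback` consecutive values,
    and y holds the value that immediately follows each window."""
    X, y = [], []
    for i in range(lookback, len(series)):
        X.append(series[i - lookback:i, 0])
        y.append(series[i, 0])
    return np.array(X), np.array(y)

# Toy column vector 0..9, shaped (10, 1) like the scaled closing prices
toy = np.arange(10, dtype=float).reshape(-1, 1)

X_toy, y_toy = make_windows(toy, lookback=3)
print(X_toy.shape, y_toy.shape)  # 10 - 3 = 7 windows and 7 targets
print(X_toy[0], y_toy[0])        # first window and the value it should predict

# Add the trailing feature axis: [samples, timesteps, features]
X_toy_3d = X_toy.reshape((X_toy.shape[0], X_toy.shape[1], 1))
print(X_toy_3d.shape)
```

Note that a series of length n yields only n - lookback samples, which is why the test set above ends up 60 rows shorter than `test_data`.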

Step 3: Build & Train LSTM Model

3.1 Create LSTM Model Architecture


# Build LSTM model
model = Sequential([
    LSTM(units=50, activation='relu', input_shape=(lookback, 1), return_sequences=True),
    Dropout(0.2),

    LSTM(units=50, activation='relu', return_sequences=True),
    Dropout(0.2),

    LSTM(units=25, activation='relu'),
    Dropout(0.2),

    Dense(units=1)
])

# Compile model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

# Print model summary
model.summary()  # summary() prints the table itself and returns None

3.2 Train the Model


# Train the model
history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    verbose=1
)

# Plot training history
plt.figure(figsize=(14, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.legend()
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(history.history['mae'], label='Training MAE')
plt.plot(history.history['val_mae'], label='Validation MAE')
plt.title('Model MAE Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Step 4: Evaluate Model Performance

4.1 Make Predictions


# Make predictions on the training and test data
train_predictions = model.predict(X_train)
test_predictions = model.predict(X_test)

# Inverse transform to get actual prices
train_predictions = scaler.inverse_transform(train_predictions)
y_train_actual = scaler.inverse_transform(y_train.reshape(-1, 1))

test_predictions = scaler.inverse_transform(test_predictions)
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))

4.2 Calculate Metrics


# Calculate performance metrics
train_rmse = np.sqrt(mean_squared_error(y_train_actual, train_predictions))
train_mae = mean_absolute_error(y_train_actual, train_predictions)

test_rmse = np.sqrt(mean_squared_error(y_test_actual, test_predictions))
test_mae = mean_absolute_error(y_test_actual, test_predictions)

print("Model Performance Metrics:")
print("=" * 40)
print(f"Training RMSE: ${train_rmse:.2f}")
print(f"Training MAE: ${train_mae:.2f}")
print(f"Testing RMSE: ${test_rmse:.2f}")
print(f"Testing MAE: ${test_mae:.2f}")
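
As a rough illustration of how the two metrics differ, the sketch below computes MAE and RMSE by hand on four made-up prices (the numbers are assumptions chosen for easy arithmetic, not model output). Because RMSE squares each error before averaging, the single large miss of $4 pulls it above MAE.

```python
import math

# Hypothetical actual and predicted prices in dollars
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 100.0, 101.0, 109.0]

# Signed errors: [1, -2, 0, 4]
errors = [p - a for a, p in zip(actual, predicted)]

# MAE: mean of absolute errors = (1 + 2 + 0 + 4) / 4
mae = sum(abs(e) for e in errors) / len(errors)

# RMSE: square root of the mean squared error = sqrt((1 + 4 + 0 + 16) / 4)
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(f"MAE:  ${mae:.2f}")
print(f"RMSE: ${rmse:.2f}")
```

RMSE is never smaller than MAE on the same errors, so a large gap between the two suggests that a few big misses dominate.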

4.3 Visualize Predictions


# Prepare data for visualization
# The predictions were already converted back to prices in 4.1, so no second
# inverse transform is needed. The first `lookback` points of each split have
# no prediction and stay NaN, which matplotlib simply leaves blank.
full_predictions_price = np.full(close_price.shape, np.nan)
full_predictions_price[lookback:train_size] = train_predictions
full_predictions_price[train_size + lookback:] = test_predictions

# Plot actual vs predicted
plt.figure(figsize=(16, 6))
plt.plot(close_price, label='Actual Price', linewidth=2)
plt.plot(full_predictions_price, label='LSTM Predictions', linewidth=2, alpha=0.7)
plt.axvline(x=train_size, color='red', linestyle='--', label='Train/Test Split')
plt.title(f'{ticker} Stock Price Prediction using LSTM', fontsize=14, fontweight='bold')
plt.xlabel('Days')
plt.ylabel('Price ($)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Zoom in on test period
plt.figure(figsize=(14, 6))
test_start = train_size + lookback
plt.plot(range(test_start, len(close_price)), y_test_actual, label='Actual Price', marker='o')
plt.plot(range(test_start, len(close_price)), test_predictions, label='Predicted Price', marker='s')
plt.title(f'{ticker} Stock Price - Test Period Detail', fontsize=14, fontweight='bold')
plt.xlabel('Days')
plt.ylabel('Price ($)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Step 5: Make Future Predictions

5.1 Predict Next 30 Days


# Prepare data for future predictions
last_sequence = scaled_data[-lookback:].reshape(1, lookback, 1)

# Generate predictions for next 30 days
future_days = 30
future_predictions = []

current_sequence = last_sequence

for _ in range(future_days):
    next_pred = model.predict(current_sequence, verbose=0)
    future_predictions.append(next_pred[0, 0])

    # Update sequence with new prediction
    current_sequence = np.append(current_sequence[:, 1:, :],
                                 next_pred.reshape(1, 1, 1), axis=1)

# Inverse transform predictions
future_predictions = np.array(future_predictions).reshape(-1, 1)
future_predictions = scaler.inverse_transform(future_predictions)

# Create dates for future predictions
# Use business days so weekends are skipped (market holidays are not excluded)
last_date = data.index[-1]
future_dates = pd.bdate_range(start=last_date + pd.Timedelta(days=1), periods=future_days)

print("\nNext 30 Days Price Predictions:")
print("=" * 40)
for date, pred in zip(future_dates, future_predictions):
    print(f"{date.strftime('%Y-%m-%d')}: ${pred[0]:.2f}")
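
The loop above is a recursive (iterated) forecast: each prediction is appended to the input window and used to predict the next step. The sketch below shows that mechanism in isolation. `toy_predict` is a hypothetical stand-in for `model.predict` (it just averages the window), and the history values are made up.

```python
from collections import deque

def toy_predict(window):
    """Hypothetical stand-in for model.predict: the mean of the window."""
    return sum(window) / len(window)

lookback = 3
history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# A deque with maxlen drops its oldest value automatically on append
window = deque(history[-lookback:], maxlen=lookback)

forecast = []
for _ in range(4):
    next_value = toy_predict(window)
    forecast.append(next_value)
    window.append(next_value)  # the prediction becomes an input for the next step

print(forecast)
print(list(window))  # after `lookback` steps the window holds predictions only
```

Since later steps are computed from earlier predictions rather than observed prices, errors can compound, so the far end of a 30-day forecast is likely to be much less reliable than the first few days.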

Step 6: Save & Deploy Model

6.1 Save the Model


# Save the trained model
model.save('stock_prediction_lstm.h5')

# Also save in the native Keras format (.keras), which recent Keras versions recommend
model.save('stock_prediction_lstm.keras')

# Save the scaler
import pickle
with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

print("Model and scaler saved successfully!")

6.2 Load and Use Saved Model


# Load the saved model
loaded_model = keras.models.load_model('stock_prediction_lstm.h5')

# Load the saved scaler
import pickle
with open('scaler.pkl', 'rb') as f:
    loaded_scaler = pickle.load(f)

# Make predictions with loaded model
predictions = loaded_model.predict(X_test)
print("Predictions made with loaded model!")

Learning Resources

Key Concepts

LSTM (Long Short-Term Memory): A type of RNN that can remember long-term dependencies
Time Series: Data points ordered by time, used for forecasting
MinMax Scaling: Normalizes data to a range [0, 1] for better neural network performance
Dropout: Regularization technique to prevent overfitting
RMSE & MAE: Metrics for evaluating regression model accuracy
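
For intuition about MinMax scaling, here is a dependency-free sketch of the underlying formula, scaled = (x - min) / (max - min), and its inverse. The function names and prices are illustrative assumptions; in the project itself `MinMaxScaler` does this work.

```python
def min_max_fit(values):
    """Learn the minimum and maximum from the fitting data."""
    return min(values), max(values)

def min_max_transform(values, lo, hi):
    """Map values so that lo -> 0.0 and hi -> 1.0."""
    return [(v - lo) / (hi - lo) for v in values]

def min_max_inverse(scaled, lo, hi):
    """Undo the scaling to recover the original units."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical prices used to fit the scaler
fit_prices = [50.0, 75.0, 100.0]
lo, hi = min_max_fit(fit_prices)

scaled = min_max_transform(fit_prices, lo, hi)
print(scaled)

# A later price above the fitted maximum maps outside [0, 1]
print(min_max_transform([125.0], lo, hi))

# The inverse transform recovers the original prices
print(min_max_inverse(scaled, lo, hi))
```

The second print shows why the [0, 1] range is only guaranteed for the data the scaler was fitted on.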

Common Issues & Solutions

Issue	Solution
Model overfitting	Increase dropout rate, reduce model size, use more training data
Poor prediction accuracy	Try different lookback periods, add more LSTM layers, increase epochs
GPU out of memory	Reduce batch size, use a smaller model, or use CPU instead
Data not available	Check internet connection, verify ticker symbol, use alternative data source

Summary & Next Steps

Congratulations! You’ve successfully built an LSTM-based stock price prediction model. Here are the next steps:

Optimize: Try different hyperparameters (layers, units, epochs)
Enhance: Add technical indicators (RSI, MACD, Bollinger Bands) to improve predictions
Deploy: Create a web app using Flask/FastAPI to serve predictions
Monitor: Track model performance over time and retrain as needed
Combine: Ensemble multiple models for more robust predictions
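
As one possible starting point for the "Enhance" step, the sketch below computes Bollinger Bands in plain Python: a moving average plus and minus a multiple of the rolling standard deviation. The function name, the toy prices, and the use of the population standard deviation are assumptions made for illustration; the 20-period window and 2-standard-deviation width are the conventional defaults.

```python
from statistics import mean, pstdev

def bollinger_bands(prices, window=20, num_std=2.0):
    """Return a list of (lower, middle, upper) tuples, one per full window.

    middle is the simple moving average; the bands sit num_std population
    standard deviations below and above it.
    """
    bands = []
    for end in range(window, len(prices) + 1):
        recent = prices[end - window:end]
        middle = mean(recent)
        spread = num_std * pstdev(recent)
        bands.append((middle - spread, middle, middle + spread))
    return bands

# Hypothetical closing prices; a short window keeps the example readable
prices = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0, 14.0]
bands = bollinger_bands(prices, window=5)

for lower, middle, upper in bands:
    print(f"{lower:.2f}  {middle:.2f}  {upper:.2f}")
```

To feed such indicators to the LSTM, each one would become an extra feature column, so the input shape would change from (lookback, 1) to (lookback, number_of_features).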

Note: Stock market predictions are inherently uncertain. This model should be used for educational purposes and supplementary analysis, not as the sole basis for investment decisions. Always conduct thorough research and consult financial advisors.
