Long short-term memory (LSTM) is a popular learning model and a special type of recurrent neural network (RNN) architecture for handling complex models in deep learning. Compared with a standard RNN, an LSTM mitigates the vanishing gradient problem and is relatively insensitive to the length of the gaps between relevant events in a sequence. Its main strength is the ability to learn and retain long-term temporal dependencies accurately. LSTM networks perform well on time-series tasks such as classification, processing, and prediction.

An LSTM architecture places memory blocks in the recurrent hidden layers. Each memory block contains an input gate, which controls the flow of input activations into the memory cell; an output gate, which controls the flow of the cell's output to the rest of the network; and a forget gate, which resets the cell's memory to discard irrelevant information and update the cell state.

LSTM architectures are categorized by cell representation and attention mechanism, including bidirectional LSTM, hierarchical and attention-based LSTM, LSTM autoencoders, convolutional LSTM, grid LSTM, and cross-modal and associative LSTM. LSTM networks also achieve strong performance on sequence-to-sequence modeling problems. Application areas include language modeling, speech-to-text transcription, machine translation, speech recognition, handwriting recognition, image captioning, question-answering chatbots, music composition, pharmaceutical development, and robot control. Directions for future work include LSTM models for large-scale text compression, solar flare prediction incorporating image data, combined CNN-LSTM models for fault diagnosis in wind turbines, hybrid LSTM models for energy consumption that analyze power-consumption attributes, and many more.
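The gate mechanics described above can be sketched as a single forward step in NumPy. This is a minimal illustrative sketch, not a reference implementation: the function names, the gate ordering inside the stacked weight matrix, and the weight shapes are assumptions made for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM forward step (illustrative sketch).

    x: input vector (n_in,); h_prev, c_prev: previous hidden and cell state (n_hid,).
    W: stacked gate weights (4*n_hid, n_in + n_hid); b: stacked biases (4*n_hid,).
    Assumed gate order in W and b: input, forget, output, candidate.
    """
    n_hid = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n_hid:1 * n_hid])   # input gate: controls input flow into the cell
    f = sigmoid(z[1 * n_hid:2 * n_hid])   # forget gate: resets irrelevant memory
    o = sigmoid(z[2 * n_hid:3 * n_hid])   # output gate: controls flow to the next step
    g = np.tanh(z[3 * n_hid:4 * n_hid])   # candidate cell update
    c = f * c_prev + i * g                # cell state update
    h = o * np.tanh(c)                    # hidden state passed to the next time step
    return h, c

# Usage: one step with small random weights and zero initial state.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```

Because the cell state `c` is updated additively (gated by `f` and `i`) rather than repeatedly multiplied through a squashing nonlinearity, gradients can flow across many time steps, which is the mechanism behind the long-term dependency learning noted above.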