Authors - Abhijit Dnyaneshwar Jadhav, Prashant G. Ahire, Madhuri Hiwale

Abstract - In vitro fertilization (IVF) is currently one of the most powerful assisted reproductive technologies for infertility treatment. However, embryo selection remains a bottleneck that strongly influences implantation and live birth rates. Traditional embryo evaluation relies on morphology grading, an approach that is subjective, variable, and heavily dependent on the skill and experience of the embryologist. To overcome the limitations of human assessment, recent advances in artificial intelligence (AI), machine learning (ML), and deep learning (DL) have enabled automated embryo evaluation from images, time-lapse morphokinetics, and clinical data. This paper comprehensively reviews currently available AI-enabled IVF systems, beginning with conventional embryo assessment and progressing to the most sophisticated multimodal deep learning frameworks. It also discusses major outstanding issues, including poor model performance on new datasets, the lack of shared and agreed-upon benchmarks, and limited model explainability. To address these gaps, we have also developed a Multimodal Explainable Artificial Intelligence Framework for IVF (MEAIF-IVF), which combines embryo images, time-lapse embryo video, and clinical patient information in a single deep learning model. The system uses convolutional neural networks and vision transformers for spatial feature extraction, recurrent neural networks for temporal modeling, and attention-based fusion for multimodal integration.
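
As a rough illustration of the attention-based fusion step named in the abstract, the sketch below weights three modality embeddings by scaled dot-product scores against a query vector and sums them. It is not the MEAIF-IVF implementation: the function name `attention_fuse`, the query vector, the embedding size, and all numeric values are assumptions chosen for illustration. In a real model the embeddings would come from the image, time-lapse, and clinical encoders, and the query would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse same-length modality embeddings into one vector.

    embeddings: dict mapping a modality name to a list of floats.
    query: list of floats of the same length (hypothetical; learned in practice).
    Returns the fused vector and the per-modality attention weights.
    """
    dim = len(query)
    for name, emb in embeddings.items():
        if len(emb) != dim:
            raise ValueError(f"embedding '{name}' has length {len(emb)}, expected {dim}")

    # Scaled dot-product score of each modality against the query.
    scores = [
        sum(q * x for q, x in zip(query, emb)) / math.sqrt(dim)
        for emb in embeddings.values()
    ]
    weights = softmax(scores)

    # Weighted sum of the modality embeddings, element by element.
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings.values()))
        for i in range(dim)
    ]
    return fused, dict(zip(embeddings.keys(), weights))


# Placeholder embeddings standing in for encoder outputs (made-up values).
example_embeddings = {
    "image": [0.2, 0.9, 0.1, 0.4],
    "timelapse": [0.7, 0.1, 0.3, 0.5],
    "clinical": [0.1, 0.2, 0.8, 0.6],
}
example_query = [0.5, 0.5, 0.5, 0.5]

fused_vector, modality_weights = attention_fuse(example_embeddings, example_query)
print(modality_weights)
```

In this sketch the returned weights show how much each modality contributes to the fused vector for a given case, which is one simple way a fusion layer of this kind could be inspected.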