- Deploying Transformers ONNX Models on Amazon SageMaker: Achieve high-scale performance utilizing Triton Inference Server with SageMaker Real-Time Inference. In AWS in Plain English, by Ram Vegiraju (Mar 13, 2024)
- An Introduction to Fine-Tuning Pre-Trained Transformers Models: Simplified utilizing the HuggingFace Trainer object. In TDS Archive, by Ram Vegiraju (Feb 17, 2024)
- Deploying Large Language Models With HuggingFace TGI: Another way to efficiently host and scale your LLMs with Amazon SageMaker. In TDS Archive, by Ram Vegiraju (Jul 14, 2023)
- Deploying LLMs on Amazon SageMaker With DJL Serving: Deploy BART on Amazon SageMaker Real-Time Inference. In TDS Archive, by Ram Vegiraju (Jun 7, 2023)
- HuggingFace Inference Endpoints: Rapid production-grade deployment of Transformers models. In TDS Archive, by Ram Vegiraju (Dec 23, 2022)
- Deploy HuggingFace Models on Amazon SageMaker in Under 10 Minutes: Use the HuggingFace Hub to deploy NLP models at scale. In TDS Archive, by Ram Vegiraju (Oct 6, 2021)
- Host ML Apps With HuggingFace Spaces: Exploring the new HuggingFace release. In TDS Archive, by Ram Vegiraju (Nov 7, 2021)