transformers (huggingface)

requirements.txt
transformers
protobuf==3.20.0
xformers

pip install -r requirements.txt

from transformers import pipeline

# Sentiment analysis with the task's default model
classifier = pipeline("sentiment-analysis")
res = classifier("Welcome to my Machine Learning course")
# res = [{'label': 'POSITIVE', 'score': 0.9995331764221191}]
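
A pipeline also accepts a list of inputs and returns one result per item. A minimal sketch (the second sentence is just an illustrative input):

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
res = classifier([
    "Welcome to my Machine Learning course",
    "I did not enjoy this lecture at all",
])
# res is a list with one {'label': ..., 'score': ...} dict per input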
from transformers import pipeline

# Text generation with distilgpt2; max_length caps the total token count
# (prompt included), num_return_sequences asks for two independent completions
generator = pipeline("text-generation", model="distilgpt2")
res = generator(
    "In this course, I will teach you how to",
    max_length=30,
    num_return_sequences=2
)
# res = [{'generated_text': 'In this course, I will teach you how to learn how to master using HTML5. Using HTML5 is great, but there are a few other'},
#        {'generated_text': 'In this course, I will teach you how to do it. If you wish to learn more about it then you can read the course here :2'}]
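
As with distilgpt2 above, the model argument pins a specific checkpoint instead of the task default. A sketch pinning the SST-2 checkpoint used in the tokenizer example below:

from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
res = classifier("Using a transformer network is simple !!")
# same [{'label': ..., 'score': ...}] output format as the default pipeline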
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

seq = "Using a transformer network is simple !!"

# Calling the tokenizer directly adds the special [CLS] (101) and [SEP] (102)
# tokens and returns an attention mask alongside the input ids
res = tokenizer(seq)
# res = {'input_ids': [101, 2478, 1037, 10938, 2121, 2897, 2003, 3722, 999, 999, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

# tokenize() splits the text into WordPiece subwords, without special tokens
tokens = tokenizer.tokenize(seq)
# tokens = ['using', 'a', 'transform', '##er', 'network', 'is', 'simple', '!', '!']

ids = tokenizer.convert_tokens_to_ids(tokens)
# ids = [2478, 1037, 10938, 2121, 2897, 2003, 3722, 999, 999]

# decode() maps ids back to a string
decoded_string = tokenizer.decode(ids)
# decoded_string = 'using a transformer network is simple!!'
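
To run the classification end to end, the encoded inputs can be returned as PyTorch tensors and fed to the model. A minimal sketch, assuming PyTorch is installed (return_tensors="pt" and model.config.id2label are standard transformers API):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# padding/truncation keep the batch rectangular; return_tensors="pt" gives torch tensors
batch = tokenizer(
    ["Using a transformer network is simple !!"],
    padding=True, truncation=True, return_tensors="pt",
)

with torch.no_grad():                   # inference only, no gradient tracking
    logits = model(**batch).logits      # raw scores, shape (batch_size, num_labels)
probs = torch.nn.functional.softmax(logits, dim=-1)
labels = [model.config.id2label[i] for i in probs.argmax(dim=-1).tolist()]
# labels = ['POSITIVE'] for this sentence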