All Questions
Tagged with half-precision-float, python
16 questions
1
vote
1
answer
62
views
What makes `print(np.half(500.2))` differ from `print(f"{np.half(500.2)}")`?
Hi everyone. I've been learning about floating-point truncation errors recently, but I found that print(np.half(500.2)) and print(f"{np.half(500.2)}") yield different results. Here are the logs I got in ...
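A minimal sketch of what is likely going on, assuming a recent NumPy (the exact printed strings are version-dependent): the nearest float16 to 500.2 is exactly 500.25, and the two print paths may format that stored value differently.
import numpy as np

x = np.half(500.2)   # the nearest float16 to 500.2 is exactly 500.25
print(x)             # str() may use a float16-aware shortest repr, e.g. 500.2
print(f"{x}")        # __format__ may fall back to Python float, e.g. 500.25
print(float(x))      # the exact stored value as a Python float: 500.25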
0
votes
1
answer
86
views
What is the difference, if any, between model.half() and model.to(dtype=torch.float16) in huggingface-transformers?
Example:
# pip install transformers
from transformers import AutoModelForTokenClassification, AutoTokenizer
# Load model
model_path = 'huawei-noah/TinyBERT_General_4L_312D'
model = ...
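A minimal sketch of one way to compare the two calls, reusing the checkpoint named in the snippet; both are expected to leave every floating-point parameter in torch.float16, and the equality check probes whether the resulting weights are identical.
import torch
from transformers import AutoModelForTokenClassification

model_path = 'huawei-noah/TinyBERT_General_4L_312D'
m1 = AutoModelForTokenClassification.from_pretrained(model_path).half()
m2 = AutoModelForTokenClassification.from_pretrained(model_path).to(dtype=torch.float16)

# Compare dtypes and values parameter by parameter.
print(next(m1.parameters()).dtype, next(m2.parameters()).dtype)
print(all(torch.equal(p1, p2) for p1, p2 in zip(m1.parameters(), m2.parameters())))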
-1
votes
1
answer
2k
views
I load a float32 Hugging Face model, cast it to float16, and save it. How can I load it as float16?
I load a huggingface-transformers float32 model, cast it to float16, and save it. How can I load it as float16?
Example:
# pip install transformers
from transformers import ...
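A minimal sketch of one possible flow, assuming a recent transformers version and reusing the TinyBERT checkpoint from the question above (the output directory is made up for illustration); the key part is passing torch_dtype=torch.float16 when reloading so the saved half-precision weights are not upcast.
import torch
from transformers import AutoModelForTokenClassification

model_path = 'huawei-noah/TinyBERT_General_4L_312D'
model = AutoModelForTokenClassification.from_pretrained(model_path).half()
model.save_pretrained('./tinybert-fp16')   # hypothetical output directory

# Reload without silently upcasting back to float32.
reloaded = AutoModelForTokenClassification.from_pretrained(
    './tinybert-fp16', torch_dtype=torch.float16)
print(next(reloaded.parameters()).dtype)   # expected: torch.float16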
0
votes
1
answer
490
views
Is there any point in setting `fp16_full_eval=True` if training in `fp16`?
I train a Huggingface model with fp16=True, e.g.:
training_args = TrainingArguments(
output_dir="./results",
evaluation_strategy="epoch",
learning_rate=4e-5,
...
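A minimal sketch of the two flags side by side, with illustrative values: roughly, fp16=True enables mixed-precision (AMP) training, while fp16_full_eval=True runs evaluation with the model fully cast to fp16, which saves memory but can affect metrics.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=4e-5,          # value taken from the snippet above
    fp16=True,                   # mixed-precision training (AMP)
    fp16_full_eval=True,         # evaluate with the model fully in fp16
)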
3
votes
0
answers
145
views
How to verify if the tensorflow code trains completely in FP16?
I'm trying to train TensorFlow (version 2.11.0) code in float16.
I checked that FP16 is supported on the RTX 3090 GPU, so I followed the link below to run the whole training in reduced precision.
...
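A minimal sketch of one way to check what actually runs in fp16 under the Keras mixed_float16 policy, assuming TF 2.x: compute dtypes should report float16 while the variables stay in float32, i.e. training is mixed precision rather than fully FP16.
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy('mixed_float16')

layer = tf.keras.layers.Dense(8)
y = layer(tf.random.normal([2, 4]))   # calling the layer creates its variables
print(layer.compute_dtype)            # 'float16': dtype used for the math
print(layer.kernel.dtype)             # float32: master weights stay full precision
print(y.dtype)                        # float16 activations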
1
vote
1
answer
947
views
Can language model inference on a CPU save memory by quantizing?
For example, according to https://cocktailpeanut.github.io/dalai/#/ the relevant figures for LLaMA-65B are:
Full: The model takes up 432.64GB
Quantized: 5.11GB * 8 = 40.88GB
The full model won't fit ...
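A back-of-envelope sketch of why quantizing helps: weight memory is roughly parameter count times bytes per weight (activation and KV-cache overhead not included). The numbers below are illustrative, not the dalai figures.
n_params = 65e9   # LLaMA-65B, approximate parameter count
for name, bytes_per_weight in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{n_params * bytes_per_weight / 2**30:.0f} GiB of weights")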
0
votes
0
answers
700
views
Is there a reason why a NaN value appears when there are no NaN values in the model parameters?
I want to train the model with FP32 and perform inference with FP16.
For other networks (ResNet) with FP16, it worked.
But EDSR (super resolution) with FP16 did not work.
The differences I found are ...
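A minimal sketch of a common cause, not a diagnosis of this specific model: float16 overflows above roughly 65504, so one large intermediate activation becomes inf and later arithmetic turns it into NaN even though no parameter is NaN.
import torch

x = torch.tensor([70000.0])
h = x.half()
print(h)        # tensor([inf], dtype=torch.float16) -- overflow, not a bad weight
print(h - h)    # tensor([nan], dtype=torch.float16) -- inf - inf produces NaN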
1
vote
1
answer
3k
views
Convert 16 bit hex value to FP16 in Python?
I'm trying to write a basic FP16-based calculator in Python to help me debug some hardware. I can't seem to find how to convert 16-bit hex values into floating-point values I can use in my code to do the ...
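A minimal sketch of one way to do the conversion, reinterpreting the 16-bit pattern as an IEEE 754 half via NumPy (the helper name is made up for illustration):
import numpy as np

def hex_to_fp16(h: str) -> np.float16:
    # Reinterpret the raw 16-bit pattern as a half-precision float.
    return np.array([int(h, 16)], dtype=np.uint16).view(np.float16)[0]

print(hex_to_fp16("3C00"))   # 1.0
print(hex_to_fp16("C000"))   # -2.0
print(hex_to_fp16("7BFF"))   # 65504.0, the largest finite fp16 value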
1
vote
2
answers
2k
views
Why is it dangerous to convert integers to float16?
I recently ran into a surprising and annoying bug in which I converted an integer into a float16 and the value changed:
>>> import numpy as np
>>> np.array([2049]).astype(np....
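A minimal sketch of the effect being asked about: float16 has a 10-bit significand, so above 2048 not every integer is representable and some round to a neighbour.
import numpy as np

# 2049 falls between the representable halves 2048 and 2050 and rounds to 2048.
print(np.array([2048, 2049, 2050]).astype(np.float16))   # [2048. 2048. 2050.]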
1
vote
1
answer
514
views
Reading a binary structure in JavaScript
I have a table that I am trying to read in JavaScript, with data that is large enough that I would like to have it in binary format to save space. Most of the table is either numbers or enums, but ...
1
vote
1
answer
3k
views
Why does converting from np.float16 to np.float32 modify the value?
When converting a number from half- to single-precision floating-point representation, I see a change in the numeric value.
Here I have 65500 stored as a half-precision float, but upgrading to single precision changes ...
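A minimal sketch of what is likely happening, assuming a recent NumPy: 65500 is not representable in float16, the nearest half value is 65504, and widening to float32 simply exposes the stored value.
import numpy as np

x = np.float16(65500)
print(x)               # may print 65500.0 (shortest decimal mapping back to the same half)
print(np.float32(x))   # 65504.0, the value that was actually stored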
8
votes
1
answer
9k
views
TensorFlow - how to use 16-bit precision float
Question
float16 can be used in NumPy but not in TensorFlow 2.4.1, which causes the error.
Is float16 available only when running on an instance with GPU with 16 bit support?
Mixed precision
Today, most ...
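A minimal sketch of plain float16 tensors in TF 2.x; whether a given op has a fast fp16 kernel depends on the device, which is what the question is probing.
import tensorflow as tf

a = tf.constant([[1.5, 2.5]], dtype=tf.float16)
b = tf.constant([[0.5], [0.25]], dtype=tf.float16)
print(tf.matmul(a, b))   # stays in float16 if the kernel is available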
3
votes
3
answers
14k
views
fp16 inference on CPU with PyTorch
I have a pretrained PyTorch model that I want to run inference on in fp16 instead of fp32. I have already tried this on the GPU, but when I try it on the CPU I get:
"sum_cpu" not implemented for 'Half' torch.
...
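A minimal sketch of the failure mode and a common workaround, hedged because CPU half-precision kernel coverage varies by PyTorch version: keep fp16 for storage but fall back to float32 (or bfloat16) for the CPU computation.
import torch

x = torch.randn(4, 4, dtype=torch.float16)
try:
    print(x.sum())          # may raise "not implemented for 'Half'" on older CPU builds
except RuntimeError as err:
    print(err)
    print(x.float().sum())  # compute in float32 on the CPU instead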
0
votes
1
answer
2k
views
Incomplete Cholesky Factorization Very Slow
Background:
I'm doing a project for my Numerical Linear Algebra course. For this project, I decided to experiment with incomplete Cholesky factorization using half-precision arithmetic and using ...
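One plausible explanation, sketched rather than verified against the project's code: NumPy float16 arithmetic is emulated in software on most CPUs, so half-precision kernels are usually slower than float32, not faster.
import time
import numpy as np

A32 = np.random.rand(500, 500).astype(np.float32)
A16 = A32.astype(np.float16)
for A in (A32, A16):
    t0 = time.perf_counter()
    A @ A                                   # float16 matmul has no fast BLAS path
    print(A.dtype, time.perf_counter() - t0)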
2
votes
2
answers
2k
views
How to do convolution with fp16 (Eigen::half) on TensorFlow
How can I use TensorFlow to do convolution using fp16 on the GPU (the Python API, using __half or Eigen::half)?
I want to test a model with fp16 on TensorFlow, but I got stuck. Actually, I found that ...
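A minimal sketch of an fp16 convolution through the Python API, assuming TF 2.x; on a GPU with FP16 support this should dispatch to the half-precision (Eigen::half / __half) kernels.
import tensorflow as tf

x = tf.random.normal([1, 8, 8, 3], dtype=tf.float16)   # NHWC input
w = tf.random.normal([3, 3, 3, 4], dtype=tf.float16)   # HWIO filter
y = tf.nn.conv2d(x, w, strides=1, padding="SAME")
print(y.dtype, y.shape)                                 # float16, (1, 8, 8, 4)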