Questions tagged [efficiency]
For questions about the efficiency of ML/AI algorithms in solving a particular problem.
26 questions
1 vote, 0 answers, 69 views
Optimal matrix multiplication - impact and applications
Let $A$ be an integer matrix of size $n\times t$ and $B$ be an integer matrix of size $t\times m$. Let the largest entry in absolute value in $A$ and $B$ have $b$ bits.
If we can multiply $A,B$ in say $\leq100(n+m)tb(\...
0 votes, 0 answers, 26 views
Efficient way to ensure high data coverage in stochastic minibatch sampling for a GNN while minimizing training time?
I am training a Graph Neural Network for inductive link prediction. The final objective is to predict links for unseen nodes. My neural network follows the general GraphSAGE pipeline but I have ...
1 vote, 0 answers, 59 views
Are there any other training methods based on this PRNG trick?
Intro
Recently, in an effort to find new, MCU-suitable training algorithms (for an NN library I'm developing), I came up with a trick (which I doubt I'm the first to find). A ...
3 votes, 1 answer, 129 views
Could the efficiencies developed by DeepSeek be put to use in the larger language models making them significantly more powerful?
The legacy LLMs have far more compute power than DeepSeek, yet the two are comparable. If the efficiencies of DeepSeek were applied to the models that have significantly more compute power, would that ...
0 votes, 1 answer, 237 views
Matrix Multiplication using a Neural Network
As I understand neural networks, they have a slow training phase (quadratic or cubic time) and a fast (linear time) inference phase.
Also, the slow training phase comes from the requirement of doing ...
0 votes, 1 answer, 154 views
Reference request: data efficiency of LLM pre-training
I've seen it stated multiple times that LLMs have much worse data efficiency than humans (i.e., they require more data to reach the same or worse performance), e.g., this tweet by Yann LeCun, or 19:30 in this talk ...
1 vote, 0 answers, 56 views
Do NNs suffer from lack of efficiency in network structure and suggesting training parameters?
I am working on dynamical systems using Optimal Control theory and trying to find the connection between this field and Machine Learning. Consider a simple 2-layer Neural Network (NN) where the ...
0 votes, 1 answer, 83 views
Compare the efficiency of a trained ML model with a non-learning-based method for solving the same problem
Suppose a certain task T is solved by a non-learning-based method A (say, an optimization-based approach). We now train a machine learning model B (say, a neural network) on the same task.
What ...
3 votes, 1 answer, 471 views
A comparison of Expert Systems and Machine Learning approaches in terms of run-time efficiency and time/space complexity
For part of a paper I am writing on Clinical Decision Support Systems (computer-aided medical decision making, e.g. diagnosis, treatment), I am trying to compare Expert Systems with systems based on ...
2 votes, 0 answers, 139 views
Are there any successful applications of transformers of small size (<10k weights)?
In the problems of NLP and sequence modeling, the Transformer architectures based on the self-attention mechanism (proposed in Attention Is All You Need) have achieved impressive results and now are ...
2 votes, 0 answers, 110 views
How to train/update neural networks faster without a decrease in performance?
I noticed that there have been many studies in recent years on how to train/update neural networks faster with equal or better performance. I found the following methods (excluding the chip arms race):
...
4 votes, 2 answers, 2k views
In the multi-head attention mechanism of the transformer, why do we need both $W_i^Q$ and ${W_i^K}^T$?
In the Attention is all you need paper, on the 4th page, we have equation 1, which describes the self-attention mechanism of the transformer architecture
$$
\text { Attention }(Q, K, V)=\operatorname{...
1 vote, 0 answers, 148 views
What is the efficiency of trained neural networks?
Training neural networks takes a while. My question is, how efficient is a neural network that is completely trained (assuming it's not a model that is constantly learning)?
I understand that this is ...
5 votes, 2 answers, 451 views
Is it possible to guide a reinforcement learning algorithm?
I have just started to study reinforcement learning and, as far as I understand, existing algorithms search for the optimal solution/policy, but do not allow the possibility for the programmer to ...
2 votes, 0 answers, 170 views
What is the most efficient data type to store probabilities?
In ML, we often have to store a huge number of values ranging from 0 to 1, most of them probabilities. The most common data type for this seems to be floating point. Indeed, the range of ...