this post was submitted on 22 Jun 2023
Learn Machine Learning
Original answer: (credit @Mercury)
Late answer, but worth posting for reference. Quoting from comments of the OP:
Naive / Straightforward Approach
Each row of A is convolved with the corresponding row of B, which amounts to M independent 1D convolutions.
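The code from the original answer did not survive in this thread, so here is a minimal sketch of what the row-by-row loop might look like. The function name `rowwise_convolve_loop`, the random test data, and the choice of `mode="full"` are assumptions for illustration; the shapes (4 rows, lengths 10 and 20) follow the example used in the answer.

```python
import numpy as np


def rowwise_convolve_loop(A, B):
    """Convolve each row of A with the matching row of B (hypothetical helper).

    Assumes "full" convolution, so each output row has length N + K - 1.
    """
    if A.shape[0] != B.shape[0]:
        raise ValueError("A and B must have the same number of rows")
    # One np.convolve call per row pair: M independent 1D convolutions.
    return np.stack([np.convolve(a, b, mode="full") for a, b in zip(A, B)])


rng = np.random.default_rng(0)
A = rng.standard_normal((4, 10))  # M = 4 rows, each of length 10
B = rng.standard_normal((4, 20))  # one length-20 kernel per row of A
C_loop = rowwise_convolve_loop(A, B)
print(C_loop.shape)  # (4, 29)
```

This works, but the Python-level loop runs once per row, which gets slow when M is large.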
No Loop + CUDA Supported Version
It is possible to replicate this operation by using PyTorch's F.conv1d. We have to imagine A as a 4-channel, 1D signal of length 10. We wish to convolve each channel in A with a specific kernel of length 20. This is a special case called a depthwise convolution, often used in deep learning.
Note that torch's conv is implemented as cross-correlation, so we need to flip B along its last axis in advance to get an actual convolution.
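A sketch of the depthwise approach, under a few assumptions: the helper name `rowwise_convolve_torch` is hypothetical, "full" output (length N + K - 1) is assumed and obtained with `padding=K - 1`, and the data is random. Setting `groups=M` is what makes each channel see only its own kernel.

```python
import numpy as np
import torch
import torch.nn.functional as F


def rowwise_convolve_torch(A, B, device="cpu"):
    """Row-wise full convolution via a depthwise F.conv1d (hypothetical helper)."""
    M, N = A.shape
    K = B.shape[1]
    # Input layout for conv1d: (batch=1, channels=M, length=N).
    x = torch.as_tensor(A, device=device).unsqueeze(0)
    # Weight layout: (out_channels=M, in_channels/groups=1, kernel=K).
    # conv1d computes cross-correlation, so flip each kernel first.
    w = torch.as_tensor(B, device=device).flip(-1).unsqueeze(1)
    # Zero-padding K - 1 on both sides yields the "full" length N + K - 1.
    y = F.conv1d(x, w, padding=K - 1, groups=M)
    return y.squeeze(0).cpu().numpy()


rng = np.random.default_rng(0)
A = rng.standard_normal((4, 10))
B = rng.standard_normal((4, 20))

device = "cuda" if torch.cuda.is_available() else "cpu"
C_torch = rowwise_convolve_torch(A, B, device=device)

# Cross-check against NumPy's row-by-row result.
C_ref = np.stack([np.convolve(a, b, mode="full") for a, b in zip(A, B)])
print(C_torch.shape, np.allclose(C_torch, C_ref))
```

If a different convolution mode is wanted (for example "valid" or "same"), the padding would need to change accordingly.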
Advantages of using a depthwise convolution with torch:
- No loops!
- The above solution can also run on CUDA/GPU, which can really speed things up if A and B are very large matrices. (From OP's comment, this seems to be the case: A is 10GB in size.)
Disadvantages:
- Overhead of converting from array to tensor (should be negligible)
- Need to flip B once before the operation