My first journey with Rust
Over my 10-year programming career, I have learned Java, Python, Go, PHP, and JavaScript. C and C++ have been painful memories ever since college. For me, learning a new programming language is always challenging: you have to let go of the syntax habits of your previous language and, at the same time, internalize the design logic of the new one. With Rust in particular, this process is extremely painful. But once you gain something and make a little progress, the sense of satisfaction and achievement comes quickly.
After 2 months of learning from the Rust documentation, I had an ambitious idea: rewrite the Apache Uniffle server in Rust.
As an important optimization for big data computing frameworks, Uniffle takes over the shuffle data of a large number of Spark jobs and handles the writing and reading of their temporary data. It has to deliver high performance and high stability while keeping memory consumption as low as possible. Because gRPC Java suffers from serious GC problems, Netty is often used in the Java world to avoid GC pressure. (As of 2024 Q2, Netty support is a stable feature in the Uniffle 0.9.0 release.)
But this time, because I longed for Rust's GC-free design and its performance, I hoped it could also give me the basic building blocks for future big data work. With that in mind, I started to rewrite Apache Uniffle's Java-based shuffle server, and that was the beginning of this journey.
Rewriting the Uniffle server is a challenging task. The Rust project has to cover gRPC protocol support, several kinds of persistent storage, coordination between asynchronous and synchronous code, and more.
gRPC support
I quickly found that the Rust community is full of creative people who love to share. The community recommended tonic for implementing the gRPC interface, and thanks to good package tooling (yes, it's Cargo), I was up and running with tonic quickly. But because tonic does not support using the bytes crate for protobuf bytes fields, I had to resort to some workarounds.
Also, compared to Java, the lack of tonic documentation and examples made it hard to figure out how to monitor gRPC, including network connections, throughput, and so on. Fortunately, it comes down to grasping the key points: I found some projects that also use tonic and learned its more advanced usage from them.
Anyway, this is a great project for me!
Support for multiple storages
Uniffle's vanilla shuffle server needs to write some large data blocks to HDFS. Unfortunately, there is no complete, easy-to-use HDFS client in the Rust world. However, Apache OpenDAL provides a basic JNI-based HDFS client, and because I work on big data infrastructure, I could quickly use OpenDAL to access HDFS. For those who have not been exposed to Hadoop, the complicated configuration will take a lot of time.
Thanks to the open source community, I quickly implemented support for HDFS.
At the same time, I also contributed some pull requests to give back to the open source community.
(Five months after writing this article, I found a newer native HDFS client and replaced the previous JNI HDFS client with it. It works better.)
Tokio's journey
As far as I know, Tokio is already a widely used asynchronous runtime in the Rust community. Candy brings happiness, but it also makes you fat. With Tokio as my asynchronous executor, I successfully ran a demo with a small amount of data.
But when I compared the Rust version of Uniffle (I call it riffle) with the Java version, I found that performance was very poor: worse by more than 2 times. So I started a long journey of investigation and performance tuning, which gave me a deeper understanding of some of Rust's features and moved the project toward industrial quality.
The Rust implementation went a step further.
First, I enabled CPU profiling and used tikv's jemalloc package, which can expose the associated CPU counters and draw flame graphs.
Many thanks to a fellow contributor, who implemented this feature.
But the flame graph did not reveal the performance bottleneck, so I kept troubleshooting and fell back to the most basic approach: logging. Along the way, I began to suspect a Tokio scheduling problem, because when Riffle receives writes, concurrency reaches 4000-5000 and the read and write pressure is very high.
I looked for a painless tool to confirm my hunch. tokio console is one option. Its dashboard showed me that there were problems, but I had no way of knowing which crate or code segment was causing them, which worried me a lot.
After searching on Google, I settled on one tool: await-tree. It can expose the wait time of an async execution, which matters a lot to me, because it lets me probe every gRPC request and see which await point is running slowly.
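To make the idea of "how long did this awaited section take" concrete, here is a minimal standard-library sketch. It is not the await-tree API and does not build a tree of spans; it only wraps one future and reports the wall-clock time from its first poll to completion. The names `Timed`, `timed`, `ThreadWaker`, and `block_on` are hypothetical helpers invented for this illustration, and the tiny `block_on` executor is only there so the sketch runs without Tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::{Duration, Instant};

/// Hypothetical wrapper: records the time from the first poll of the inner
/// future until it completes.
pub struct Timed<F> {
    label: &'static str,
    started: Option<Instant>,
    inner: Pin<Box<F>>, // boxed so that `Timed<F>` itself is `Unpin`
}

/// Wraps `fut` so that awaiting it also yields its label and elapsed time.
pub fn timed<F: Future>(label: &'static str, fut: F) -> Timed<F> {
    Timed { label, started: None, inner: Box::pin(fut) }
}

impl<F: Future> Future for Timed<F> {
    type Output = (F::Output, &'static str, Duration);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = &mut *self;
        // Start the clock on the first poll only.
        let started = *this.started.get_or_insert_with(Instant::now);
        match this.inner.as_mut().poll(cx) {
            Poll::Ready(out) => Poll::Ready((out, this.label, started.elapsed())),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, used here only to keep the sketch
/// free of external crates.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}
```

Wrapping a suspicious section as `timed("flush", some_future).await` and logging the returned duration gives a crude per-section view; a dedicated tool such as await-tree is what shows where inside a request the time is actually spent.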
With await-tree in place, the problem was exposed immediately. The Tokio mutex was causing too many context switches, and performance was very bad. After replacing it with std::sync::Mutex, performance improved noticeably. Asynchronous locks are a big trap for beginners: they won't make you deadlock, but they cause serious performance problems under high concurrency. In the future, I will use other tools and develop more metrics to improve performance further.
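As a rough illustration of the pattern, here is a minimal sketch that guards shared state with std::sync::Mutex and keeps the critical section as short as possible. `ShuffleBuffer` and `run_writers` are hypothetical names, not code from riffle, and plain threads stand in for concurrent gRPC handlers so the example needs no external crates. In async code the same idea applies as long as the guard is dropped before any `.await`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical in-memory buffer that many writers append blocks to.
#[derive(Default)]
pub struct ShuffleBuffer {
    blocks: Mutex<Vec<Vec<u8>>>,
}

impl ShuffleBuffer {
    /// Appends one block. The lock is held only for the duration of the push.
    pub fn append(&self, block: Vec<u8>) {
        let mut guard = self.blocks.lock().unwrap();
        guard.push(block);
        // `guard` is dropped at the end of this scope, releasing the lock.
    }

    /// Returns the total number of bytes currently buffered.
    pub fn total_bytes(&self) -> usize {
        self.blocks.lock().unwrap().iter().map(|b| b.len()).sum()
    }
}

/// Spawns `writers` threads that each append `per_writer` blocks of
/// `block_len` bytes, then returns the total number of buffered bytes.
pub fn run_writers(writers: usize, per_writer: usize, block_len: usize) -> usize {
    let buffer = Arc::new(ShuffleBuffer::default());
    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let buffer = Arc::clone(&buffer);
            thread::spawn(move || {
                for _ in 0..per_writer {
                    // Allocate outside the lock; only the push is guarded.
                    buffer.append(vec![0u8; block_len]);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    buffer.total_bytes()
}
```

The Tokio documentation itself notes that the standard library mutex is often preferable in async code when the lock is never held across an `.await`; an async mutex is mainly needed when the guard must stay alive across an await point.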
Conclusion
It's been a really interesting journey. I not only took an ambitious little project from 0 to 1, but also gained insight into different performance profiles.
Finally, here are a few development suggestions for beginners that I picked up during this Rust journey:
- Don't get stuck on complicated macros, painful ownership rules, or advanced features like Pin. Writing a real project is what matters most.
- Do not use the Tokio mutex on performance-critical paths.
- Do not use nested DashMap structures. They were another source of performance loss that showed up in the flame graph.
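One common way to avoid nested concurrent maps is to flatten them into a single map with a composite key, so each operation takes one lookup and one lock instead of several. The sketch below assumes a hypothetical key of (application id, shuffle id, partition id); the article does not say what the nested maps in riffle were keyed on. A standard-library `RwLock<HashMap<..>>` stands in for DashMap here to keep the example free of external crates, but the same composite key would work with a single DashMap.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical composite key: (application id, shuffle id, partition id).
type PartitionKey = (String, u32, u32);

/// Flat map from composite key to buffered bytes, replacing a nested
/// map-of-map-of-map layout.
#[derive(Default)]
pub struct PartitionSizes {
    inner: RwLock<HashMap<PartitionKey, u64>>,
}

impl PartitionSizes {
    /// Adds `bytes` to the counter for one partition, creating it if needed.
    pub fn add(&self, app: &str, shuffle: u32, partition: u32, bytes: u64) {
        let mut map = self.inner.write().unwrap();
        *map.entry((app.to_string(), shuffle, partition)).or_insert(0) += bytes;
    }

    /// Returns the recorded size for one partition, or 0 if it is unknown.
    pub fn get(&self, app: &str, shuffle: u32, partition: u32) -> u64 {
        let map = self.inner.read().unwrap();
        map.get(&(app.to_string(), shuffle, partition))
            .copied()
            .unwrap_or(0)
    }
}
```

The trade-off is that dropping everything for one application now means scanning the keys rather than removing a single outer entry, so whether flattening pays off depends on the access pattern.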
Here is the result of my Rust journey: https://github.com/zuston/riffle