The author of Go rqlite tells you: How important algorithms are when developing database software!

This article, from the golang tutorial column, presents "The author of Go rqlite tells you: How important algorithms are when developing database software!" I hope it will be helpful to readers who need it.

Writing database software is fascinating work. I've been heavily involved in open source database development for the past two years, and database programming is probably the most inspiring work you can do as a software developer.

What is truly striking, however, is how much my attitude toward databases has changed over the past 6 years. From being uninterested at the beginning, I have come to think that database systems are the pinnacle of software engineering.

Not Knowing Any Better

For most of my career, my only experience with databases was reading about them, usually in a boring context. Open any undergraduate textbook on databases and you'll see what I mean. You will usually find a table like the following presented as a typical use case for relational databases:

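| ID | FIRST  | LAST  | TITLE          | DEPARTMENT |
|----|--------|-------|----------------|------------|
| 1  | Robert | Kelly | Director       | Marketing  |
| 2  | Tom    | Burke | Representative | Sales      |
| 3  | John   | Smith | Vice President | Sales      |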

Could you read anything more boring? If this is all there is to databases, I want nothing to do with them. What's the point? Software is much cooler than this, right? So I completely avoided anything to do with databases for a long time.

You never forget your first CRUD application

In 2009, after years of writing embedded software, Linux device drivers, and networking software, I found myself leading a team that needed to build a web-based system. You see, the AWS cloud had arrived, and licensing technology based on MAC addresses no longer worked in the cloud. My team had to build a licensing portal for our new EC2-based software appliance. Since we had a lot of experience with Python, we chose Django, running on MySQL. Then something new happened: I actually started working with a database.

As our CRUD application continued to run in the field, I began to realize how important the database was, and how central it was to our systems. If we lost the database, all our software development would have been in vain. If the database corrupted data, our customers' devices could become unlicensed and their networks would cease to function. If the database did not function properly, thousands of people would be affected simultaneously. But none of these things happened. The database always worked. It never let us down. I was impressed.
Later I discovered foreign key constraints, unique constraints, referential integrity, and indexes (remember, at this time I knew nothing about these things). The database could help me in various ways to build a more robust system. I finally realized that modern databases are amazing. Databases are the most boring things in the world, until you actually have to build a system with them.

You will also never forget your first search system

By 2012, I was leading a team that built a large key-value database based on a large indexing and search system, with Elasticsearch at its core. It was eye-opening to see what a system like Elasticsearch, a technology built on world-class indexing, can do, even with terabytes of log data underneath.
By then I had seen even databases and search systems fail, but I was fascinated by database technology. In 2014, I joined a small, dedicated team developing the core of an [open source time series database](github.com/influxdata/influxdb).

The algorithms I learned really are very important

Only in database development can Big O analysis really come alive. Databases are one of the few applications where programmers still need to loop, sort, and filter millions of objects. This is one of the few places where a lot of the boring material learned in CS classes is important.

This is not the case with much other software development. Writing boot ROM firmware? No, algorithms were never important to me there. A tuner device driver? No, they didn't matter. Network device management software? A CRUD application? Hardly. All of these disciplines require different skills and knowledge. Most of the time, I only discussed runtime complexity in interviews.
But with database development, all this changed. It's a wonderful thing to see a system return the same correct results in a fraction of the time because of an algorithm change, and to see it happen in your own code, in a system you built. Algorithms matter.
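
To make the point concrete, here is a minimal Go sketch, not taken from any real database, that counts the points falling inside a time range in two ways. The names countLinear and countSorted and the sample data are hypothetical, and the second function assumes the timestamps are already sorted in ascending order.

```go
package main

import (
	"fmt"
	"sort"
)

// countLinear checks every timestamp: O(n) work per query.
func countLinear(ts []int64, lo, hi int64) int {
	n := 0
	for _, t := range ts {
		if t >= lo && t < hi {
			n++
		}
	}
	return n
}

// countSorted assumes ts is sorted ascending and finds the two
// boundaries of [lo, hi) by binary search: O(log n) work per query.
func countSorted(ts []int64, lo, hi int64) int {
	if hi <= lo {
		return 0
	}
	start := sort.Search(len(ts), func(i int) bool { return ts[i] >= lo })
	end := sort.Search(len(ts), func(i int) bool { return ts[i] >= hi })
	return end - start
}

func main() {
	// Hypothetical data: one timestamp per second for a million seconds.
	ts := make([]int64, 1_000_000)
	for i := range ts {
		ts[i] = int64(i)
	}
	fmt.Println(countLinear(ts, 250_000, 750_000))
	fmt.Println(countSorted(ts, 250_000, 750_000))
}
```

Both calls return the same count, but the sorted version performs two binary searches instead of touching every element. That is the kind of change that leaves the results identical while cutting the time to a fraction.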

Performance Matters, Too

There's an old story in software that goes like this: A programmer writes some code that runs ten times faster than the previous version. He shows it off, but someone points out that the data it produces is slightly different from the correct data. "But it's ten times faster," the programmer says. "Well, if it doesn't need to be correct, I can make a version that takes up no space at all and runs infinitely fast," replies the other.
This morality tale has always had a great impact on me. Being right is more important than anything else. That is true. But it also led me to believe that a program is valuable simply because it produces the right results.

For databases, this is not the case.
Performance is more than just a feature. It is a requirement. Those who are willing to pay for databases often do so because they have large amounts of data. If the database doesn't perform well in this situation, if it doesn't return results quickly and efficiently, then it might as well not work at all.

Do you think a write system is complicated?

I think the thing that surprised me most about developing databases is how complex query engines are. I have a lot of experience building systems that write and store data to disk. Making these systems work well can be a significant challenge.
But this complexity is usually much less than that of the query engine. A flexible query system, which in effect means building a system to answer questions when you don't know what the questions will be, requires serious design thinking. The query planner must be effective. Query systems must support many orthogonal requirements, such as filtering by certain dimensions, grouping by other dimensions, and joining data from different tables, and sometimes must support data from external sources. Finally, the query system must be efficient and perform well. This leads to a tension between abstraction and optimization in design and implementation, which requires real skill to manage well.
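
As a rough illustration of what "orthogonal requirements" means, the Go sketch below keeps filtering and grouping as independent parts of a query description. Everything in it (the Row and Query types and the Count function) is hypothetical and far simpler than a real query engine; the rows are borrowed from the textbook-style employee table earlier in this article.

```go
package main

import "fmt"

// Row is a hypothetical record modeled on the employee table above.
type Row struct {
	First, Last, Title, Department string
}

// Query describes what to compute, separately from how it is executed.
type Query struct {
	Where   func(Row) bool   // filter; nil keeps every row
	GroupBy func(Row) string // grouping key; nil puts all rows in one group
}

// Count executes q in a single pass and returns the row count per group.
func Count(rows []Row, q Query) map[string]int {
	out := make(map[string]int)
	for _, r := range rows {
		if q.Where != nil && !q.Where(r) {
			continue
		}
		key := ""
		if q.GroupBy != nil {
			key = q.GroupBy(r)
		}
		out[key]++
	}
	return out
}

func main() {
	rows := []Row{
		{"Robert", "Kelly", "Director", "Marketing"},
		{"Tom", "Burke", "Representative", "Sales"},
		{"John", "Smith", "Vice President", "Sales"},
	}
	q := Query{
		Where:   func(r Row) bool { return r.Title != "Director" },
		GroupBy: func(r Row) string { return r.Department },
	}
	fmt.Println(Count(rows, q)) // prints map[Sales:2]
}
```

Even in this toy form the tension is visible: the Query abstraction lets callers combine any filter with any grouping, but the executor always scans every row. A real planner would have to decide when an index or a different evaluation order can answer the same question faster.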

In the real world, it must be operated

Any serious database must support basic operations such as backup, recovery, shard management, and monitoring.
If I, as a serious operator, can't back up your database, I can't use it, simple as that. It doesn't matter how quickly the database accepts writes, or how small its memory footprint is during a query. If I can't protect the data in the database from failures beyond the control of you, the creator of the database, I will never be able to run it comfortably.
Of course, there are many ways to back up a database without the cooperation of the database itself. But built-in methods are usually best. This is also my recommendation for rqlite v2.0. If I want anyone to use rqlite seriously, I have to solve the real-world problem of a system failing completely and its data being left behind for a long time.

Therefore, when designing and implementing a database, build operational support from the beginning. Make it a fundamental part of your design. Your users will thank you for it.

The answer is usually "it depends"

When you first start working with a database, especially as an operator, you often ask questions like: At what rate can the system index? How quickly does it respond to queries? How much disk space do I need? How big can a shard be and still work? How can I speed it up? All asked without qualification. I used to ask them myself.
Maybe you can talk to the database programmers and ask them these questions. And the answer you'll often, perhaps always, get is: it depends. You have to benchmark; you have to measure. This can be irritating to hear, and it may seem like they are avoiding responsibility.

But that is not the case.

Now, when I hear questions like this, I smile. They are too naive.
Indexing rate may depend on the size of the data, not just the number of documents or data points. It may also depend on batching, the cardinality of the data, whether the database is clustered, which columns and fields in the data are indexed, whether it is new data or an update to existing data, the machine the database is running on, its RAM and IO performance, and the replication used.
The variables that control performance never end.
For queries, it may depend on the time range of the time series data. It depends on the number of records hit, the number of fields queried, whether a range scan is involved, whether the data is indexed, the type of index used, the number of shards that may be accessed, whether the data is local, and the characteristics of the machine. Is it in stock? Is it undergoing maintenance? Is the network busy?

So the answer is always: it depends. Database designers are being honest. They can know everything about the system they built and still not know the answers to your questions.

Programming Bucket List

If there is one piece of advice for developers who want to improve their programming skills, it would be to join a database development team. My programming skills have improved tremendously because of database development - it's been a wonderful coding experience.

Original address: https://www.philipotoole.com/what-i-learned-from-programming-a-database/

Translation address: https://learnku.com/go/t/64605
