P12915 /PQTDTOT/ link reply
Let's have a thread for Programming Questions That Don't Deserve Their Own Thread.
P12916 link reply
I'm writing a program that uses an SQLite database. Is there a way to "prepare" queries at compile time so that they can be executed using "step" directly at runtime? I'm using the C API, in case that matters.
P12918 link reply
So something like sqlite3_prepare() but it generates the bytecode at compile time instead of runtime?
P12919 link reply
P12918
Exactly
P12956 link reply
P12916
From what I understand, sqlite3_prepare needs the database schema. Since the schema isn't changing, I thought of saving the sqlite3_stmt objects so that I can include them directly in my program. Problem: it looks like a rather complex object and I can't find a definition to know what to save.
P13018 link reply
P12916
You wouldn't want to even if you could. The query planner may use statistics about the tables when generating prepared statements. You won't be able to get any benefit from ANALYZE.
See https://sqlite.org/src/file?ci=trunk&name=src/vdbeInt.h&ln=421; you would need to change a lot to make it work.
P13044 link reply
P12916
P12919
Prepare the statement once at startup time. Then you can use it as many times as you want. And you can periodically re-prepare the statement to get the benefits P13018 mentions.
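A minimal sketch of that pattern, assuming an already-opened database handle (the "users" table and the query itself are made up for illustration):

#include <stdio.h>
#include <sqlite3.h>

static sqlite3_stmt *stmt; /* prepared once, reused for every execution */

int init(sqlite3 *db) {
    /* compile the SQL to bytecode once at startup */
    return sqlite3_prepare_v2(db,
        "SELECT name FROM users WHERE id = ?;", -1, &stmt, NULL);
}

void lookup(int id) {
    sqlite3_bind_int(stmt, 1, id);
    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("%s\n", (const char *)sqlite3_column_text(stmt, 0));
    sqlite3_reset(stmt); /* rewind so the statement can run again */
}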
P24897 link reply
The factorial function can be defined recursively

int factorial(int n) {
    if (n == 0) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}

or, which is a lot better, using a for loop

int factorial(int n) {
    int fact = 1;
    for (int i = 2; i <= n; i++) {
        fact *= i;
    }
    return fact;
}


Is there any function that can be computed recursively but that cannot be rewritten using loops?
P24899 link reply
P24897 radicals
P24901 link reply
P24897
>Is there any function that can be computed recursively but that cannot be rewritten using loops?
I believe there are no such cases. You can always turn a recursive algorithm into a loop that uses a stack to store temporary state, because a recursive call is exactly that: storing the current state on a stack, proceeding with the algorithm, then later restoring the state.
>or, which is a lot [bold: better], using a for loop
Sometimes, the compiler optimizes away tail recursion by changing it into a loop. Assuming your environment doesn't perform that optimization, the only reason recursion is slower is that it works against the optimization mechanisms in CPUs.
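For example, here is a tail-recursive variant of the factorial above (a sketch; gcc and clang typically do this transformation at -O2, though the C standard doesn't guarantee it):

/* The recursive call is the last thing the function does, so an
   optimizing compiler can reuse the current stack frame and turn
   the recursion into a loop. */
static int factorial_acc(int n, int acc) {
    if (n <= 1)
        return acc;
    return factorial_acc(n - 1, n * acc); /* tail call */
}

int factorial(int n) {
    return factorial_acc(n, 1);
}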
P24914 link reply
P24901
>You can always turn a recursive algorithm into a loop that uses a stack to store temporary state, because a recursive call is exactly that: storing the current state on a stack, proceeding with the algorithm, then later restoring the state.

I guess so. The recursive function

void dummy() {
    do_whatever();
    if (cond)
        dummy();
    return;
}

Could be rewritten (assuming a Stack structure and relevant methods exist) into

void dummy() {
    Stack stack = create_stack();
    do_whatever();
    while (cond) {
        pushlocalvariables(stack);
        do_whatever();
    }
    while (stack.size != 0) {
        restorelocalvariables(stack);
    }
}

>the only reason recursion is slower is that it works against the optimization mechanisms in CPU's
And that there is overhead in the function call itself, and that a loop can be more readily parallelised. Speed isn't the worst problem with recursion though.

I would like to reformulate my question then, because your solution doesn't solve the main problem with recursion. That is, it can run out of stack space.
Is there any function that can be computed recursively, but cannot be rewritten using loops [bold: and without using stacks or other infinitely growing structures]?
Obviously, I wouldn't count algorithms which use an explicit stack in their recursive form.
P28058 link reply
How does gcc's "-O" option work? I've been experimenting with SIMD instructions and have implemented an image processing algorithm. The SIMD implementation runs around 5 times faster than my scalar implementation. However, using the "-O7" option on gcc, the scalar implementation runs 2 times faster than it did originally and the SIMD implementation 10 times faster.
How come gcc was able to better optimize a method that was already optimized than one that wasn't? I use an Intel CPU and only used SSE and SSE2 instructions. My first guess was that, while gcc wasn't able to vectorize the scalar algorithm, it could upgrade the SIMD version to use larger registers. However, objdump tells me that the xmm registers are still in use.


Relative execution speeds compared to unoptimized scalar:

                 | scalar | simd
-----------------+--------+------
 no optimization |  1.0   |  5.6
-----------------+--------+------
 -O7             |  2.5   |  67
P28071 link reply
Looking a bit closer at the generated assembly, the compiler is pretty dumb if it isn't told to optimize. Here is an example (using AT&T):
movb $1, -290(%rbp)
movsbl -290(%rbp), %edx
movsbl -290(%rbp), %ecx
movsbl -290(%rbp), %esi
movsbl -290(%rbp), %edi
movsbl -290(%rbp), %r8d
movsbl -290(%rbp), %r9d
movsbl -290(%rbp), %r10d
movsbl -290(%rbp), %r11d
movsbl -290(%rbp), %ebx
movsbl -290(%rbp), %r12d
movsbl -290(%rbp), %r13d
movsbl -290(%rbp), %r14d
movsbl -290(%rbp), %r15d
movsbl -290(%rbp), %eax

Rather than:
mov $1, %eax
mov %eax, %edx
mov %eax, %ecx
mov %eax, %esi
mov %eax, %edi
mov %eax, %r8d
mov %eax, %r9d
mov %eax, %r10d
mov %eax, %r11d
mov %eax, %ebx
mov %eax, %r12d
mov %eax, %r13d
mov %eax, %r14d
mov %eax, %r15d

What's more, this sequence is used to initialize a table, so it didn't even need to use this many registers. This is part of the initialization step though, so it isn't that big of a deal. What explains most of the difference in performance, I think, is that it only uses two xmm registers (out of the 16 available in 64-bit mode), writing and reading to and from memory every time it needs another. Weirdly enough, the compiler seems to use all the scalar registers just fine.
P43736 link reply
P24914
>doesn't solve the main problem with recursion. That is, it can run out of stack space.
This isn't the case with tail recursive functions as long as you have tail call optimization.

>Is there any function that can be computed recursively, but cannot be rewritten using loops [bold: and without using stacks or other infinitely growing structures]?
I'm not sure if I understand this question. Lambda calculus and Turing machines are equivalent models of computation (the Church-Turing thesis); each can simulate the other.
P43753 link reply
>Is there any function that can be computed recursively, but cannot be rewritten using loops [bold: and without using stacks or other infinitely growing structures]?
If I interpret "infinitely growing" as meaning
[tex: \lim_{\text{input size} \to \infty} (\text{memory required}) = \infty]
then I think most functions are like that. With addition and subtraction, we could avoid the need to use memory by adding numbers as streams, but I don't think that would work for multiplication.
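For what it's worth, a minimal sketch of the streaming idea, with hypothetical next_a()/next_b() digit sources. The only state carried between digits is the carry:

#include <stdio.h>

/* Hypothetical digit sources: each call yields the next decimal digit
   of its number, least-significant first, or -1 when exhausted. */
int next_a(void);
int next_b(void);

void stream_add(void) {
    int carry = 0;
    for (;;) {
        int a = next_a(), b = next_b();
        if (a < 0 && b < 0 && carry == 0)
            break;
        int sum = (a < 0 ? 0 : a) + (b < 0 ? 0 : b) + carry;
        putchar('0' + sum % 10); /* result digits, least-significant first */
        carry = sum / 10;
    }
}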
P43761 link reply
How do you professionally organize a codebase? I've had some mid-sized C++ projects before where I just went on adding directories and files as I needed them, and documentation was mostly an afterthought. But now I have some ideas for a project that would surely be large(ish), with multiple compiled object files, and this got me thinking. Are there standards for structuring a large(ish) project? When exactly should new directories get made, or when should functions be split up into different files? Any help or pointers are much appreciated, because searching this up only gave me results of people stuck with pre-existing large codebases or indians explaining how to use directories for webdev soyware libraries.
P43767 link reply
P43761
Caveat: I've personally only done programming for a hobby and, worse, for research stuff in school, and my experience with large projects is limited to working on existing ones. But I would think that organization should happen based on the idea of separation of concerns. Component A does task A and doesn't have to worry about the details of component B which does task B. As such, the method of organization should be very dependent on what your program actually does, and blindly following generic template guides might do more harm than good. Above all, you should plan for the project to be reorganized in the future, and avoid doing stuff that locks you into a particular scheme (e.g. designs based on inheritance).
P43805 link reply
P43761
I've worked professionally and have been programming as a hobby for a very long time and P43767 is on point.
Emphasis on
>Component A does task A and doesn't have to worry about the details of component B which does task B.
because this is extremely important. This also means your components should have [bold: well defined public interfaces] (not talking OOP) because any outside use of a component is a hard dependency.
You should also really consider every inter-component dependency carefully. In other words, you don't necessarily need A to know about B even if A needs something B computes.
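As a sketch of what that can look like in C (the image component and its functions are made up for illustration), a single header is the component's entire public surface, and the struct stays opaque so callers can't grow dependencies on its internals:

/* image.h -- everything other components are allowed to see */
#ifndef IMAGE_H
#define IMAGE_H

typedef struct image image; /* opaque: the fields live only in image.c */

image *image_load(const char *path);
void   image_resize(image *img, int width, int height);
void   image_free(image *img);

#endif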

Don't hold professional code bases in too high esteem. A lot of it is trash and don't even get me started on legacy code. Time and other constraints and bad management really can fuck up code bases like you wouldn't believe. I've seen a lot of hobby projects with better code, organization and language knowledge than some professional code bases.

>Are there standards for building a large(ish) project?
Some templates aren't bad but they're really just recommendations and probably not an ideal fit for your specific project. There are patterns like model view controller that fit most web projects for example and ASP.Net has an MVC Application template.

>When exactly should new directories get made, or when should functions be split up to different files?

This depends on the project and is ultimately up to you, but for a large web project you can split it into sub-projects and areas with models/views/controllers, which is what we did and it worked quite well for us.

- Large Web Project
  - Sub Project 1 - User Front End
    - Home
      - Models
      - Views
      - Controllers
    - Account Management
      - Models
      - Views
      - Controllers
    - ...
  - Sub Project 2 - Administration
    - Home
    - Moderation
    - Access Controls
    - ...
  - Sub Project 3 - Product Management
    - Home
    - Products
    - Discounts
    - Statistics
    - ...
  - Common
    - Utilities
    - ...

While this works for large projects, it would be complete overkill for a small blog for example. Over-engineering can be just as bad or even worse than under-engineering.

Another thing is, don't be afraid to throw away code when the code base is still small and fresh and you're working through ideas. Most time spent will be figuring things out anyway. What took you initially 2 weeks to write you'll be able to re-write within a few days and fix a lot of the design problems that you encountered along the way. Better now than months down the line. Once software gets big the opposite is true. Avoid rewriting it at all costs, change it incrementally if you must. It is almost never worth rewriting large software.
P47664 link reply
I'm looking to write an interpreter in a safe language so the interpreted code is sandboxed. The idea is to give it a very limited set of functionality and no direct access to any host resources. Functionality that isn't required won't even be implemented, so the attack surface should be extremely small.

What are some good safe programming languages to do this in? It also doesn't have to be particularly fast.
P47668 link reply
P47664
>The idea is to give it a very limited set of functionality and not access any host resources directly.
TCL was designed to do exactly that.
https://tcl.tk/software/plugin/safetcl.html

There might be others but javascript basically ate everyone's lunch on this one.
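If you end up embedding Tcl instead of writing your own, creating a safe interpreter from C looks roughly like this (a sketch, assuming a Tcl 8.x installation; error handling mostly omitted):

#include <stdio.h>
#include <tcl.h>

int main(void) {
    Tcl_Interp *interp = Tcl_CreateInterp();

    /* strip the interpreter of file, exec, socket, env access, etc. */
    if (Tcl_MakeSafe(interp) != TCL_OK)
        return 1;

    /* untrusted code can still compute... */
    Tcl_Eval(interp, "expr {6 * 7}");
    printf("%s\n", Tcl_GetStringResult(interp));

    /* ...but commands that touch the host fail */
    if (Tcl_Eval(interp, "open /etc/passwd") != TCL_OK)
        printf("blocked: %s\n", Tcl_GetStringResult(interp));

    Tcl_DeleteInterp(interp);
    return 0;
}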
P47830 link reply
P47668
Neat. Thanks for mentioning safetcl. They have a paper on it which has a lot of good information on how I could implement this myself and what to pay attention to. I'll write the interpreter in Java and have each instance do only one specific thing and then connect the safe interpreters to each other in some safe manner.

Here's the paper if anyone else is interested.
https://www.usenix.org/legacy/publications/library/proceedings/usenix98/full_papers/levy/levy.pdf
The paper also references another paper, "A Note on the Confinement Problem" by B. Lampson, which mentions ways in which a compromised but restricted program could leak information to the outside through noisy (CPU/RAM usage) or noiseless (file locking) channels. Interesting stuff.